Introduction to Microsoft Fabric: using a Notebook to load and transform data into a Lakehouse
What's up guys?!!!
Here is a post on Microsoft Fabric: a unified platform for data engineering, data warehousing, and business intelligence. It is essentially a one-stop, end-to-end analytics and data platform designed for enterprises that need a single solution covering data movement, ingestion, processing, transformation, real-time event routing, and report building. This blog will give you an idea of how to ingest and transform data from an external source (e.g. a website, an API, files, or even an Azure storage account) and load it into a Lakehouse.
Ok: let's begin by creating a free account for Microsoft Fabric. Start by signing up for a trial instance of D365: https://learn.microsoft.com/en-us/dynamics365/project-operations/environment/admin-trials
This will create a D365 CE/F&O instance, along with a number of utilities like:
a. Azure portal login: although it will be an entirely zero-credit instance, you can still create Entra IDs from here. This can help you connect to the F&O instance (if need be).
b. You can connect to Power Platform utilities: creating Power Apps, flows, and Power BI reports.
c. Free access to Microsoft Fabric, by visiting https://app.fabric.microsoft.com/ using the ID created from the above. This will create the necessary capacities for Microsoft Fabric, with enough provisions to let you do plenty of R&D for 60 days ✌✌✌
Once you log in, your landing page should look like this:
Look at the URL in the address bar. Note that I have purposely kept experience=data-engineering. You can change this to the Power BI experience if you wish to explore the Power BI capabilities of MS Fabric.
Once done, you can create a new Workspace as follows:
Next, let us create a notebook for your Lakehouse. Click on New and select Notebook. As usual, give a proper name to your Notebook >> Click Create. This will automatically create a Notebook under the Lakehouse created in the above step.
All set: let us now give you some background on what we will be doing. We need a dataset to demo how to bring data from an external source and load it as a file into the Lakehouse. For our demo, let us take it from Kaggle: a gigantic storehouse of data for data engineering. You can create an account here, and yes: it's free.
For our testing purposes, I will be using the Kaggle gym members exercise dataset, which can be found here:
https://www.kaggle.com/datasets/valakhorasani/gym-members-exercise-dataset
Click on your name icon >>
Under API, click on Create New Token:
This will create a new API token for your username and download it to your local computer as a kaggle.json file.
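With the token in hand, you can pull the dataset straight from a notebook cell. Below is a minimal sketch of what such a cell could look like, assuming you have installed the kaggle package in the session first (e.g. run %pip install kaggle in its own cell), that you paste in the username and key from your kaggle.json as environment variables, and that a default Lakehouse is attached to the notebook (which exposes its Files area at /lakehouse/default/Files/). The placeholder credentials and paths are assumptions to adapt to your own setup:

```python
import os

# Credentials come from the kaggle.json you downloaded above.
# The placeholders below are assumptions -- replace them with your own values.
os.environ["KAGGLE_USERNAME"] = "<your-kaggle-username>"
os.environ["KAGGLE_KEY"] = "<your-kaggle-key>"

from kaggle.api.kaggle_api_extended import KaggleApi

# Authenticate against Kaggle using the environment variables set above
api = KaggleApi()
api.authenticate()

# Download and unzip the gym members dataset directly into the Lakehouse Files area.
# /lakehouse/default/Files/ assumes a default Lakehouse is attached to this notebook.
api.dataset_download_files(
    "valakhorasani/gym-members-exercise-dataset",
    path="/lakehouse/default/Files/",
    unzip=True,
)
```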
Click on the highlighted Run button to start your Spark session and execute the command you have entered.
And just under Files (once you do a refresh), you will see the gym members exercise file has been uploaded too:
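If you prefer to verify this from code rather than the UI, a quick directory listing does the job. This small sketch assumes the same default Lakehouse mount used above:

```python
import os

# Quick programmatic check of the Lakehouse Files area (assumes the default Lakehouse mount)
for name in os.listdir("/lakehouse/default/Files/"):
    print(name)
```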
You can now inspect the content of the file by loading it into a DataFrame:
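Here is a minimal sketch of such a cell. The CSV file name below is an assumption (check the exact name under Files after the download); the spark session and display() helper are provided out of the box in Fabric notebooks:

```python
# Read the downloaded CSV from the Lakehouse Files area into a Spark DataFrame.
# The file name is an assumption -- use whatever name appears under Files in your Lakehouse.
df = (
    spark.read.format("csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load("Files/gym_members_exercise_tracking.csv")
)

# Inspect the data
display(df)
df.printSchema()

# Optionally, persist it as a Delta table in the Lakehouse for downstream transformations
# df.write.format("delta").mode("overwrite").saveAsTable("gym_members_exercise")
```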