How to Upload a Dataset to Hugging Face Using a Python Notebook

Shashi Vishwakarma
2 min read · Oct 17, 2023

Hello all, and welcome to a new post. Today we will learn how to create a new dataset in the Hugging Face dataset repository and upload a file from your local machine to it.

Photo by Nikolay Kovalenko on Unsplash

Prerequisites:
1. A Hugging Face account.
2. A Jupyter Notebook where you can run Python code.

Let's install the dependencies we need for uploading a dataset to Hugging Face.

! pip install datasets
! pip install huggingface_hub

Import the libraries and log in to Hugging Face.

from huggingface_hub import login
from datasets import Dataset
login()

After running the above code, you will be prompted for the access token of your Hugging Face account. You can get a token under Settings > Access Tokens. Create a new token with write permission.

Once you have the token, paste it into the text field shown by the prompt and click Login to authenticate.
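
If you would rather not paste the token into a widget on every run, one option is to read it from an environment variable and pass it to login() directly. This is only a sketch: the variable name HF_TOKEN and the helper names get_hf_token and login_from_env are my own choices for illustration, not part of the steps above.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you exported the token under.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to a token with write permission")
    return token


def login_from_env(env_var="HF_TOKEN"):
    """Authenticate without the interactive prompt."""
    # Imported here so get_hf_token() can be used on its own.
    from huggingface_hub import login

    login(token=get_hf_token(env_var))
```

Calling login_from_env() in place of login() should then authenticate without showing the prompt, as long as the variable is set before the notebook starts.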

Now it's time to create a dataset object from a pandas dataframe. I have a very simple CSV file with two columns, ID and Student, and I will load this file into a pandas dataframe.

import pandas as pd
# Named df rather than input so the built-in input() is not shadowed
df = pd.read_csv("data.csv")
df
Sample Dataset

Let's create a dataset object from the pandas dataframe. I am going to split it into train and test sets, i.e. 70% for training and 30% for testing.

dataset = Dataset.from_pandas(df)
dataset = dataset.train_test_split(test_size=0.3)

# Print the dataset
print(dataset)
DatasetDict({
    train: Dataset({
        features: ['ID', ' Student'],
        num_rows: 7
    })
    test: Dataset({
        features: ['ID', ' Student'],
        num_rows: 3
    })
})

As you can see above, a DatasetDict has been created with train and test splits.
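
Two details in the printed output are worth a second look. The second column shows up as ' Student' with a leading space, which usually means the CSV header has a space after the comma. Also, the rows that land in train and test can change between runs because the split is shuffled. Here is a minimal sketch that handles both, assuming a small stand-in CSV (the three rows are made up for illustration and are not my real file). The helper name make_splits and the seed value 42 are arbitrary choices for this example.

```python
import io

import pandas as pd

# Made-up stand-in for data.csv; note the space after each comma.
csv_text = "ID, Student\n1, Asha\n2, Ben\n3, Chen\n"

raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))    # the second name keeps its leading space

# Strip the stray whitespace from the column names and the text values.
clean = raw.copy()
clean.columns = clean.columns.str.strip()
clean["Student"] = clean["Student"].str.strip()
print(list(clean.columns))


def make_splits(df, test_size=0.3, seed=42):
    """Build a DatasetDict with a reproducible train/test split."""
    # Imported here so the pandas cleanup above runs without the datasets package.
    from datasets import Dataset

    # A fixed seed makes the shuffle, and therefore the split, repeatable.
    return Dataset.from_pandas(df).train_test_split(test_size=test_size, seed=seed)
```

Calling make_splits(clean) should give the same kind of DatasetDict as before, with a clean 'Student' column and the same rows in each split on every run.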

The final step is to upload the dataset, but before that we have to create a repository on Hugging Face.

Go to your Hugging Face profile > New Dataset and create a new dataset repository.
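
If you prefer to stay in the notebook, the repository can also be created from code with create_repo from huggingface_hub. A minimal sketch, assuming you are already logged in. build_repo_id and create_dataset_repo are hypothetical helper names, and the profile and repository names are placeholders you need to fill in.

```python
def build_repo_id(profile_name, repo_name):
    """Join profile and repository name into the 'profile/repo' form."""
    profile_name, repo_name = profile_name.strip(), repo_name.strip()
    if not profile_name or not repo_name or "/" in profile_name + repo_name:
        raise ValueError("Expected a plain profile name and a plain repository name")
    return f"{profile_name}/{repo_name}"


def create_dataset_repo(profile_name, repo_name, private=False):
    """Create the dataset repository on the Hub and return its id."""
    # Imported here so build_repo_id() can be used on its own.
    from huggingface_hub import create_repo

    repo_id = build_repo_id(profile_name, repo_name)
    # exist_ok=True makes the call safe to re-run if the repository already exists.
    create_repo(repo_id, repo_type="dataset", private=private, exist_ok=True)
    return repo_id
```

The id returned by create_dataset_repo has the same "profile/repo" form that push_to_hub expects.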

Once the dataset repository has been created, run the command below to upload the dataset to Hugging Face.

dataset.push_to_hub("<YourProfileName>/<YourDatasetRepoName>")
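
To check that the upload worked, you can load the dataset back from the Hub and compare the split sizes with what you pushed. A small sketch under the same assumptions (you are logged in and the repository id is filled in); split_sizes and reload_split_sizes are hypothetical helper names.

```python
def split_sizes(dataset_dict):
    """Map each split name to its number of rows."""
    return {name: len(split) for name, split in dataset_dict.items()}


def reload_split_sizes(repo_id):
    """Download the dataset from the Hub and report its split sizes."""
    # Imported here so split_sizes() can be used on its own.
    from datasets import load_dataset

    return split_sizes(load_dataset(repo_id))
```

Once the push has finished, split_sizes(dataset) should match reload_split_sizes("<YourProfileName>/<YourDatasetRepoName>").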

Great!!! You have successfully published your first dataset on Hugging Face.


Keep Learning!! Keep Sharing!!
