Creating a dataset using Octopub couldn't be easier. Before getting started, make sure you've got a Github account. If you haven't got one, sign up for one first. Once you've got an account, follow along with our video tutorial or work through the steps below to create your first dataset.

Step by Step: Using Octopub

First, go to the homepage and click 'Sign in with Github'

Step 1: Nav to Homepage

If you’re not already logged into Github, you’ll be prompted to sign in to your Github account.

Step 2: Sign In To Github

Once you're signed in, you’ll see a screen like this, asking you to authorise your account with Octopub. We need these permissions to create and update datasets in Github on your behalf, and won’t use them for any other purpose:

Step 3: Authorise Octopub with Github

Next, click ‘Authorize application’. You’ll then be redirected to Octopub with a message telling you you’re logged in.

Step 4: You have successfully Authorized Octopub with Github

Next, under the ‘Datasets’ menu, click ‘Add a dataset’. You’ll then be redirected to the dataset creation page:

Step 5: Add a dataset

The first thing you’re asked for is the title of the dataset. Keep this short, sweet and descriptive, telling users what the dataset is about.

Step 7: Descriptively name a dataset

Next is the description. You can use this to give extra context to the data, and it can be as long or as short as you'd like.

Step 8: Add a description to your dataset

Now, provide the name and web address of the person or organisation who is publishing the data. Data users appreciate this, as it tells them who to contact with any questions.

Step 9: Add publication details to your dataset

Next, you can choose the license. This lets people know what they can and can't do with the data.

Step 10: Add license details to your dataset

The license options are as follows:

Creative Commons Attribution 4.0

Choose this license if you want anyone to be able to use your data, on the understanding that they credit you or your organisation.

Creative Commons Attribution Share-Alike 4.0

Choose this license if you want anyone to be able to use your data under the same conditions as above, with the added requirement that anything created using your data is released under the same license.

CC0 1.0

Choose this license if you want anyone to be able to use your data with no restrictions.

Open Government License 3.0 (United Kingdom)

This license is similar to Creative Commons Attribution, but was created for the UK government. Choose this license if the data you are publishing comes from a UK national or local government agency.

Open Data Commons Attribution License 1.0

This license is similar to Creative Commons Attribution, but applies specifically to data.

Open Data Commons Public Domain Dedication and License 1.0

This license is similar to CC0, but applies specifically to data.

For more information on open data licensing, see the ODI’s guide to open data licensing.
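If it helps to see the license options above as a decision, here is a minimal sketch in Python. The function `suggest_license` and its parameters are hypothetical and are not part of Octopub; the sketch only encodes the descriptions given in this guide and is no substitute for reading the licenses themselves.

```python
# Hypothetical helper: turns the license descriptions in this guide into a
# simple decision. The names here are illustrative, not an Octopub API.

def suggest_license(attribution: bool,
                    share_alike: bool = False,
                    uk_government: bool = False,
                    data_specific: bool = False) -> str:
    """Suggest one of the license names listed in this guide."""
    if uk_government:
        # Published by a UK national or local government agency.
        return "Open Government License 3.0 (United Kingdom)"
    if share_alike:
        # Share-alike includes the attribution requirement.
        return "Creative Commons Attribution Share-Alike 4.0"
    if attribution:
        # Users must credit you; optionally pick the data-specific variant.
        if data_specific:
            return "Open Data Commons Attribution License 1.0"
        return "Creative Commons Attribution 4.0"
    # No restrictions at all.
    if data_specific:
        return "Open Data Commons Public Domain Dedication and License 1.0"
    return "CC0 1.0"


print(suggest_license(attribution=True))
print(suggest_license(attribution=False, data_specific=True))
```

The order of the checks is a design choice made for this sketch: the UK government case is tested first because the guide recommends that license whenever the publisher is a UK government agency.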

You can also choose whether to publish your data privately or publicly. Only people with paid Github accounts can publish privately, so leave this set to 'yes' for now.

Step 11: Determine visibility of your dataset

Next, choose how often you think the data will be updated. This lets data reusers know whether the data is likely to change.

Step 12: Add publication frequency details to your dataset
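Before filling in the form, it can be handy to gather the details from the steps above in one place. The following is a minimal sketch, assuming you keep them in a small Python structure; the `DatasetDetails` class and the `missing_details` function are hypothetical names invented for this example, and whether Octopub's form enforces any of these checks is not stated in this guide.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class DatasetDetails:
    """The details this guide asks for (hypothetical structure)."""
    title: str
    description: str
    publisher_name: str
    publisher_url: str
    license: str
    frequency: str


def missing_details(details: DatasetDetails) -> list[str]:
    """Return a list of details that still look incomplete."""
    problems = []
    # The description may be as long or as short as you like, so it is not checked.
    for field in ("title", "publisher_name", "license", "frequency"):
        if not getattr(details, field).strip():
            problems.append(f"{field} is blank")
    # A web address should at least have a scheme and a host.
    parsed = urlparse(details.publisher_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("publisher_url is not a full web address")
    return problems


# Example values are made up for illustration.
example = DatasetDetails(
    title="Street tree locations",
    description="",
    publisher_name="Example Council",
    publisher_url="example.org",
    license="CC0 1.0",
    frequency="annual",
)
print(missing_details(example))  # → ['publisher_url is not a full web address']
```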

Now comes the nitty-gritty part: adding the data. Choose a title and description for your data file, then click 'Choose file' to select a CSV file to upload.

Step 13: Upload data files
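If you'd like to sanity-check a CSV file before uploading it, a short script can catch common problems such as rows with the wrong number of cells. This is a minimal sketch using Python's standard `csv` module; the `check_csv` function is a hypothetical helper, and the checks are assumptions about what makes a tidy CSV, not a description of what Octopub itself validates.

```python
import csv
import io


def check_csv(text: str) -> list[str]:
    """Return a list of problems found in CSV text (empty if none found)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]
    if any(not name.strip() for name in header):
        problems.append("header row has a blank column name")
    if len(set(header)) != len(header):
        problems.append("header row has duplicate column names")

    # Every data row should have as many cells as the header row.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {number} has {len(row)} cells, expected {len(header)}"
            )
    return problems


sample = "name,value\nalpha,1\nbeta\n"
print(check_csv(sample))  # → ['row 3 has 1 cells, expected 2']
```

To check a real file, read its contents first, for example with `open("my-data.csv", encoding="utf-8", newline="").read()`, where `my-data.csv` stands in for your own file name.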

Once the file has uploaded, you can scroll down and click 'Submit' to create your dataset. You'll then see a notice telling you your data has been queued for creation. Within a few minutes, you should receive an email with a link allowing you to see your dataset in all its glory!