This tutorial is a basic introduction to Python for anyone in the heritage community. It is an updated version of the Introduction to Coding appendix in my book Open Heritage Data.

What is Python?:

Python is a general-purpose, open-source programming language first released in 1991 (python.org). It is often described as fast, friendly and easy to learn. While it can be used for a variety of applications, websites and games, it is also popular in data science because of its data manipulation tools and relatively gentle learning curve. Python comes with a wide range of modules and packages that provide many specialised, ready-to-use functions. This means that you can do a lot of data manipulation with only a few lines of code.
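As a small illustration of the "few lines of code" point, here is a minimal sketch using Counter from Python's built-in collections module. The list of artist names is made-up sample data, not taken from a real dataset.

```python
from collections import Counter

# made-up sample data: artist names as they might appear in a collection export
artists = ["Anna", "Marie", "Anne", "Anna", "Marie", "Anna"]

# Counter tallies how many times each name occurs in the list
counts = Counter(artists)

# most_common(2) returns the two most frequent names together with their counts
top_two = counts.most_common(2)
print(top_two)  # [('Anna', 3), ('Marie', 2)]
```

Counting the same values by hand would need a loop and a few extra variables. The ready-made function does that work for you.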

Why do I need Python?:

In the book, I introduced PHP as a programming language suited to publishing whole heritage datasets in an online environment. However, Python is a more suitable choice if you wish to analyse a dataset for research purposes or visualise a dataset for publication. Because of the extensive code libraries available, you can go from basic Python knowledge to more advanced data manipulation in a few steps, so you don’t need a computer science degree to use Python for your research.

For an example of this, see the tutorial using a dataset of 19th-century dogs, which I created for my talk at the University of Edinburgh Centre for Data, Culture & Society, "Skills in Heritage Data Science: Meet the Dogs of 19th Century Denmark".

Example of Python code:

# first we define two variables, calculate a third and print the result (11)
x = 3
y = 8
z = x + y
print(z)

# print the data type of the x variable, result is int
print(type(x))

# create a list of artists
artist = ['Anna', 'Marie', 'Anne']

# make a for loop and print each of the values in the list
for x in artist:
    print(x)

Getting started with Python:

All code needs to “run” in an environment that understands the language. Take HTML as an example – if you open HTML code in MS Word, you will probably just see the code as plain text. But if you open the code in a browser, it will render your code into website elements. The same goes for Python – it needs to run in an environment/program that understands the Python commands.

There are many options for this. You can install Python together with an environment on your own computer, but I would suggest that you save this option for later. There are also online Python environments, editors and compilers available. Here I will use the Google Colab environment, as it is part of the Google Drive setup which many are already familiar with.

Go to https://colab.research.google.com. You will need to be logged into a Google/Gmail account, so log in first if you are not already. You will then see a welcome page, where you can start by making a NEW NOTEBOOK. A notebook is a file (with .ipynb at the end of its name) which consists of cells containing either text or code.
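Once a notebook is open, a first code cell could look something like the sketch below. It only checks that the environment understands Python and that code cells run. The variable name answer is just an illustration.

```python
import sys

# show which version of Python the environment is running
print(sys.version)

# a tiny calculation to confirm that the code cell executes
answer = 2 + 3
print(answer)  # 5
```

In Colab you run a cell by clicking the play button next to it. The printed output then appears directly below the cell.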

You can have a look at my beginners workbook here!

Python absolute beginners cheat sheet:

[This is still a work in progress]

This one is for my 2020 students who felt that all online cheat sheets for Python beginners were anything but – here is an absolute beginners cheat sheet.
Check out the accompanying beginners workbook to see it in action.

Variables – containers for data (text with quotation marks: x = "hello", numbers without: y = 5)
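A minimal sketch of the difference between text and numbers. The variable names greeting, count and year_as_text are only examples.

```python
# text (called a string) goes inside quotation marks
greeting = "hello"

# numbers are written without quotation marks
count = 5

# type() tells you what kind of data a variable holds
print(type(greeting))  # <class 'str'>
print(type(count))     # <class 'int'>

# a number inside quotation marks is treated as text, not as a number
year_as_text = "1885"
print(type(year_as_text))  # <class 'str'>
```

This matters when you want to calculate with your data: Python can add 5 and 5, but it will not do arithmetic on "1885" until the text has been converted to a number.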

Most used functions:
print() – outputs whatever is placed inside the parentheses