Variables and Data

Variables store information. All modeling and analytics require data. Once the data is input into Python, we can use statistical techniques to extract actionable information from the raw data.

Inputting and Printing Data

Below, variable a is defined to store the number 5. Next, variable b is defined to store 5 (the value of a) plus 2, which is 7. The value of b is confirmed on line three of the following code cell. If we run the cell, Python prints out 7.

a = 5
b = a + 2
b

By default, a code cell prints out the value of its last line to the user. For example, to see the value of a, we could simply type:

a

This approach is fairly limited. If we want to see both the value of a and the value of b, the following will not achieve our goal:

a
b

Remember, only the last line of code will print out. In order to print multiple pieces of information, we use the print() function. A function in Python is something that we will discuss in greater detail later on in this section. For now, note that print(a) will print the value of a and print(b) will print the value of b.
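As a minimal sketch of the cell just described (reusing the values of a and b from earlier in this section), each print() call below produces its own line of output, regardless of where it sits in the cell:

```python
a = 5
b = a + 2

print(a)  # prints 5
print(b)  # prints 7
```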


Printing is simple, but remarkably important. We will be feeding Python information to process (e.g. a firm’s financial statements, a consumer’s credit history, or a set of stock returns). Printing is useful to (1) verify that the information is correct in Python and (2) see the result of our data manipulation. For instance, suppose that we feed Python a portfolio of individual stock returns. After instructing Python on how to do some statistical analysis, we will be able to calculate the portfolio’s Sharpe ratio. That information needs to be printed back to us, the users, so that we know whether the portfolio does well (in a risk-return sense).

Note that when the last line of your code cell has a value to show, it will always print out, even if it is redundant.
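The sketch below is an assumed illustration of such a redundant last line, again using a and b from earlier. When it is run as a code cell, the value of b appears twice: once because of print(b), and once more because b is the last line of the cell.

```python
a = 5
b = a + 2

print(a)  # prints 5
print(b)  # prints 7
b         # redundant: as the last line of a code cell, this displays 7 again
```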


The last line of code will not print out anything, however, if there is nothing that warrants printing. This statement will make a little bit more sense later on once we talk about functions in more detail. For now, observe that if the last line only defines a variable (e.g. c = 8) then nothing will print out.

c = 8

Concept check: In the following code cell, perform four tasks (each task should be done on a new line of code). First, define a variable net_income to be equal to 10. Second, define a variable total_assets equal to 30. Third, define a variable roa equal to net_income divided by total_assets. Fourth, print out the value of roa. Hint: division in Python is done with the / character, so if we set x equal to 1 divided by 3 we would type x = 1/3.

Naming Variables and Adding Comments

Variable names must start with a letter or an underscore, and can contain letters, numbers, and underscores. In general, variables should be named in a descriptive manner. For example, if you have information about a firm’s cost of capital, storing that information in a variable named x is a horrible idea.
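One way to check whether a candidate name follows the character rules is the string method isidentifier(), which is part of standard Python. The sketch below is only an illustration: the candidate names are made up for this example, and the method does not flag reserved words such as class, which also cannot be used as variable names.

```python
# each check asks: would this text be accepted as a variable name?
descriptive_ok = "cost_of_capital".isidentifier()
leading_underscore_ok = "_cost".isidentifier()
leading_digit_ok = "2q_revenue".isidentifier()
hyphen_ok = "cost-of-capital".isidentifier()

print(descriptive_ok)         # True
print(leading_underscore_ok)  # True: a name may start with an underscore
print(leading_digit_ok)       # False: a name cannot start with a number
print(hyphen_ok)              # False: hyphens are not allowed in names
```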

You should always strive to make your code readable. What this means is that it should be relatively easy to open up code that you wrote earlier and know what it does. Choosing intelligent variable names helps tremendously. Consider the following example:

a = b / (c+b) * e + c / (c+b) * f * (1-g)

Can you tell what the above line of code does? If you live and breathe finance formulas, the pattern of the equation above may be familiar. We can all agree, however, that the mystery is much easier to solve if we simply give the variables more meaningful names:

wacc = equity / (debt+equity) * cost_of_equity + debt / (debt+equity) * cost_of_debt * (1 - tax_rate)

One method of making long variable names readable is, as shown above, to space out words via underscores. That is, if you have a variable that contains a firm’s cost of capital as its information, it is better to name this something like cost_of_capital or costOfCapital rather than a non-descriptive name like a or b.

Note that you can also use # to add comments to your code: Python will ignore everything on a line following a # symbol. Comments are discussed in more detail later in this section.

To see why separating words matters, consider a name that runs the words together:

accountsreceivable = 10

Even for short(ish) phrases like “accounts receivable”, squishing all of the characters together into one long word renders the variable name difficult to read. The underscore naming method would use:

accounts_receivable = 10

which is much easier to read. Others prefer the “camel hump” method (more commonly called camel case):

accountsReceivable = 10

Whether you use underscores or camel humps is down to personal preference, although Python’s PEP 8 style guide recommends underscores for variable names. Using neither underscores nor humps, however, violates rules of good programming.

Let’s return to the example formula:

wacc = equity / (debt+equity) * cost_of_equity + debt / (debt+equity) * cost_of_debt * (1 - tax_rate)

Besides good variable naming, we can make the code more readable by including comments. Comments are notes that programmers leave inside code. Python does not read these notes when it runs the code. A note must be marked as a comment, however. The following code cell, which contains an unmarked note, will not run successfully.

define c equal a plus b
c = a + b

  File "<ipython-input-10-a8368e710330>", line 1
    define c equal a plus b
SyntaxError: invalid syntax

The code fails because the statement

define c equal a plus b

is something that Python thinks is code. To transform the note into a comment so that Python ignores it, we use the # character. Python ignores the # character and everything that follows it on that line. Use this to take notes about what’s happening in the code.

# define c to equal a plus b
c = a+b

Using comments, the “nicest” version of the example formula is

# compute the weighted average cost of capital
wacc = equity / (debt+equity) * cost_of_equity + debt / (debt+equity) * cost_of_debt * (1 - tax_rate)
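To see the formula produce a number, the sketch below plugs in purely hypothetical inputs (the capital structure, costs, and tax rate are illustrative assumptions, not data from any firm). It also checks that the cryptic single-letter version from earlier gives the same answer, which is a reminder that descriptive names change readability, not results.

```python
# hypothetical inputs, chosen only for illustration
equity = 60            # value of equity
debt = 40              # value of debt
cost_of_equity = 0.10  # required return on equity
cost_of_debt = 0.05    # pre-tax cost of debt
tax_rate = 0.25        # corporate tax rate

# compute the weighted average cost of capital
wacc = equity / (debt+equity) * cost_of_equity + debt / (debt+equity) * cost_of_debt * (1 - tax_rate)

# the same calculation with the non-descriptive names used earlier
b = equity
c = debt
e = cost_of_equity
f = cost_of_debt
g = tax_rate
a = b / (c+b) * e + c / (c+b) * f * (1-g)

print(round(wacc, 4))  # 0.075, i.e. 7.5%
print(a == wacc)       # True: same arithmetic, only the names differ
```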

Note that comments can appear on the same line as some Python code. When this happens, Python stops reading the line of code at the # character.

c = a + b # let c equal a plus b

Concept check: In the following code cell, perform 6 tasks (each task should be its own line of code). First, enter a comment that reads “apply the Gordon growth model”. Second, define a variable d equal to 5 and add a comment at the end of the line that says “dividend”. Third, define a variable r equal to 0.1 and add a comment at the end of the line that says “discount rate”. Fourth, define a variable g equal to 0.02 and add a comment at the end of the line that says “dividend growth rate”. Fifth, define a variable v equal to d / (r-g). Sixth, print out the value of v.

Basic Variable Types

There are a handful of basic variable types.

a = 5        # a is an integer (there is no decimal point)
b = 1.5      # b is a floating point number (in general, don't worry about integer vs. floating point)
c = False    # c is a Boolean
d = 'banana' # d is a string

We won’t worry too much about the distinction between integers and floating points. To us, both are just numbers.
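If you are ever unsure what type a variable holds, Python’s built-in type() function will report it. The short sketch below reuses the four variables defined above and also shows that integers and floating point numbers mix freely in arithmetic.

```python
a = 5
b = 1.5
c = False
d = 'banana'

print(type(a))  # <class 'int'>
print(type(b))  # <class 'float'>
print(type(c))  # <class 'bool'>
print(type(d))  # <class 'str'>

print(a + b)     # 6.5: an integer plus a floating point number
print(5 == 5.0)  # True: both are just the number five
```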

Boolean variable types, named after George Boole, are equal to either True or False (note the capital T and F). We will see some examples of Boolean values later on in this section.

Strings store characters (e.g. single values like 'c' or '%') and words (combinations of characters like 'cat' or 'dog'). The characters/words in strings are encased in either single quotes like 'cat' or double quotes like "dog". In Python, whether you use single quotes or double quotes usually doesn’t matter.
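The sketch below illustrates both halves of that statement. The variable names single, double, and note are made up for this example. The first comparison shows that the quote style is not part of the string. The last two lines show one case where the choice does matter: a string that itself contains an apostrophe is easiest to write with double quotes.

```python
single = 'dog'
double = "dog"
print(single == double)  # True: both variables store the same three characters

# the apostrophe in firm's would end a single-quoted string early,
# so double quotes are used here
note = "the firm's cost of capital"
print(note)
```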

Concept check: In the following code cell, complete each comment to indicate the likely variable type of a given piece of information. The first line is completed for you, as an example.

# shares outstanding: integer
# cost of equity capital:
# leverage ratio:
# stock ticker symbol:
# number of options awarded to the CEO:
# amount of assets under management:
# name of the company's largest shareholder:
# number of stocks in the portfolio:

Programming is useful because variables can store lots of information (more than just a single number or word!).

e = [1,2,3]  # e is a list
f = {'word 1':'definition 1', 'word 2':'definition 2'} # f is a dictionary

Above, variable e stores information about multiple numbers. Lists can store lots of things, not just numbers. For example, the list fang = ['FB', 'AMZN', 'NFLX', 'GOOG'] stores a list of string variables, where each string corresponds to one of the ticker symbols for the FANG stocks.

Above, variable f stores information about words and their meanings. Like lists, dictionaries can store lots of things. For example, FB = {'3/23/20':148.10, '3/24/20':160.98, '3/25/20':156.21, '3/26/20':163.34, '3/27/20':156.79} records daily stock price information for Facebook, Inc. for the week of March 23, 2020. Dictionary variables hold keys (e.g. a word to look up) and values (e.g. the definition of the word). Here, each key is a string that records the date, and each value is a floating point number that corresponds to the stock price.
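As a brief preview (the details of accessing items come later), the sketch below defines the two example variables from the text, counts their contents with the built-in len() function, and looks up one price by its date key.

```python
fang = ['FB', 'AMZN', 'NFLX', 'GOOG']
FB = {'3/23/20':148.10, '3/24/20':160.98, '3/25/20':156.21, '3/26/20':163.34, '3/27/20':156.79}

print(len(fang))      # 4: the list holds four ticker symbols
print(len(FB))        # 5: the dictionary holds five key-value pairs
print(FB['3/23/20'])  # 148.1: the value stored under the key '3/23/20'
```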

We’ll return to lists and dictionaries later, including a discussion of how to modify/update data inside the list/dictionary, as well as instructions for accessing single items within a list/dictionary. For now, we’ll work with variables that hold single pieces of information for the sake of simplicity. But note that variables can store lots of data simultaneously, and that’s what makes programming truly useful for financial practitioners.