Podcast Summary: Neuroscience meets Psychology

Summary

In this podcast, Dr. Andrew Huberman and Dr. Jordan Peterson discuss and try to explain various real-life phenomena from the viewpoints of psychology and neuroscience.

  • Affirmations really do work because they capture a fundamental principle of how our neurology works.
  • Cleaning your desk before starting work on a hard task (or making your bed before a hard day) really helps to increase energy, focus, and motivation.

Notes

Autonomic Nervous System

  • 4:10 Autonomic Nervous System (automatic).
    • Controls vegetative functions
      • rate of digestion
      • stuff that keeps you from urinating while you’re asleep
    • Peterson: all the things that are too complex for us to think through
  • 3 main types of body-to-brain signalling
    • heart rate (speeding up or slowing down)
    • gut (empty or full)
    • breath (rate, depth)
    • There is both chemical and mechanical signaling from the body to the brain
  • Parasympathetic and sympathetic systems
    • Seesaw of alertness and calmness
  • What is interesting is our interpretation of these signals and how it relates to anxiety and exploratory behavior and where the nodes of control are.
    • It’s like a hinge on a seesaw
    • For some people the balance is shifted towards alertness, so they might often feel that their overall level of autonomic arousal is inappropriate for the demands of their life
      • i.e. their heart is racing
    • Other people may feel under-energized and exhausted
    • Both states originate in the autonomic nervous system
  • The area of the brain related to this regulation is the Prefrontal Cortex and in particular the left dorsolateral prefrontal cortex.
    • It has a direct connection to 2 brain regions that are critical to whether you «feel right»: Anterior Cingulate Cortex and Insula
      • Insula is responsible for interpreting incoming bodily signals
        • Also coming to Insula are the signals from Amygdala
  • 11:52 Body reports to conscious attention
    • The Insula reports on the nature of bodily states to the prefrontal cortex in a manner that allows us to be consciously aware of our body states
    • We can take our physiological state into account when we envision plans
      • Part of the function of the dorsolateral prefrontal cortex is allowing us to envision different versions of the future (those are plans).

Prefrontal Cortex

  • Prefrontal Cortex is a flexible rule-setting structure
    • we apply different rules in different contexts
    • both Insula and Prefrontal Cortex are involved in this «conversation» that establishes which rules are appropriate for a given situation
      • prefrontal cortex is switching sensitive behavioral patterns
    • Left dorsolateral Prefrontal Cortex can access memory from Hippocampus: «Last time I responded like this, I didn’t get the results I wanted»
    • also Left Dorsolateral Prefrontal Cortex is connected to structures that then feed into the Vagus Nerve, which can slow the heart rate down.
      • as Left Dorsolateral Prefrontal Cortex acquires a new rule set it sends a signal to slow the heart down through the Vagus Nerve
  • Prefrontal Cortex is not suppressing, but conducting lower level signals
  • Normally the Prefrontal Cortex is leading the response of the Insula.
    • If a person is depressed or in a state of chronic anxiety, then Insula and Anterior Cingulate Cortex start to lead.
    • We see it in dysregulated arguments: people default to the singular and primitive rule set
  • 22:50 The line between anxiety and exploratory behavior:
    • Autonomic arousal (tendency to be more alert) is a healthy response
  • 23:10 At the moment adrenaline is released from the adrenals, there is a parallel signal in the brain. Locus coeruleus sprinkles the entire brain with adrenaline.

Emotional States

  • 24:10 If a person has a lesion of the Dorsolateral Prefrontal Cortex or if it is otherwise inactivated, the person becomes incredibly accurate at any motor task but loses the ability to establish rules.
    • i.e. a person can shoot very accurately in a computer game, but can’t tell if a target is an enemy or a friend.
    • as a purely sensory motor response machine, the Prefrontal Cortex is not necessary and even detrimental
    • If you get rid of a Prefrontal Cortex, everything becomes a stimulus
      • For a baby everything is a stimulus
        • 2-year-olds just cycle through innate motivational states.
          • When they are interested in something, they are 100% interested. When they are angry, they are 100% angry. If they are tired, they instantly fall into a coma.
            • It’s because they don’t manifest any integrating Prefrontal Cortex activity until they are 3 when they can start engaging in joint play states with other children. Then they can modulate their underlying emotions with the abstract representation or goal which is sometimes jointly shared.
        • Adults become infant-like in their responses when anxiety is high
          • Peterson: Anxiety simplifies us. If we are unable to compute a complex and sophisticated pathway forward that takes multiple variables into account simultaneously, we can’t just do nothing. We’re going to default to some primordial and direct state.

Future Selves

  • 28:10 Prefrontal Cortex generates potential abstract patterns of action, so they can be assessed before they are implemented. It’s like it generates potential future selves.
  • Prefrontal Cortex is connected to priors via the hippocampus, it can take into account the current state, i.e. do I have the energy to undergo a particular pattern (through Insula, and Anterior Cingulate Cortex).
  • Dorsolateral Prefrontal Cortex also can control the body:
    • default of neural inputs to the heart and the breathing system is to be very activated
    • the brain provides suppression, or braking, of those signals
  • 31:07. The default reproductive strategy of an animal that has no behavioral flexibility is to have as many offspring as possible, to spread out.
    • The reason is that all variability in animal behavior is genetically coded. For it to adapt, it has to produce a lot of copies of itself, most of which die.
    • Humans evolved a mechanism for manufacturing artificial selves, so we can put forward fictional selves in abstraction and kill them off when they are not necessary without us dying.
    • When we are describing these abstract artificial selves (avatars) (alternative modes of action), we are telling the stories.
      • Huberman: Prefrontal Cortex is a rule-changing alternate self-accessing machine that can also calm the body
      • Imagining different selves and different outcomes requires that we suppress how we feel in our body at the moment.
        • To change the rules we need to imagine how we would feel when we complete an action, so we need to suppress our current feeling
  • 37:40 When people are confronted with anxiety-provoking scenarios (via virtual reality), the pause response is associated with autonomic arousal (stress), but it was the lowest anxiety response.
    • Retreat was the next level in terms of heart rate change and changes in the Insula.
    • And then there was a subset of individuals, who would confront the fear. Not reflexively, but after some consideration, they would lean into the challenge. This response is associated with the highest levels of arousal.
    • A subset of people will be very scared, but they still will march on and even explore jumping from a great height (knowing that it is virtual). They also demonstrate changes in the Insula activity and breathing.
      • Stimulating Nucleus Reuniens in the midline Thalamus converted mice from just scared to also willing to confront their fears strategically.
      • If it is stimulated without any fear stimulus, subjects (both mice and people) love this feeling
      • Nucleus Reuniens is connected directly to major hubs of Dopamine release.

Dopaminergic System

  • Huberman: We have one major reward system — dopaminergic.
  • The subjective feeling reported by people after the stimulation was «mild frustration, the anticipation of something (though they didn’t know what)»
    • It’s like «something good is going to happen»
    • it’s a «hope system»
    • it’s not about having, it’s about wanting
    • it’s not about pleasure, it’s about craving, motivation, and drive
  • The Dopamine system is in contact with the autonomic system
  • Prefrontal Cortex is a part of the dopamine reward system
    • Prefrontal Cortex projects different selves into the future
  • Reward Prediction Error
    • If you are expecting something, some dopamine is generated
    • If the expected never arrives, the disappointment makes you feel much sadder
  • When you anticipate something, the dopamine level rises
    • When you get what you want, the dopamine level rises even more, but then it decreases below baseline
      • it is the basis of addiction
    • the reward is not getting the ice cream, but the moment just before, when you know that you are going to get it
  • 50:00 How addicted sub-personality grows:
    • somewhere down the line there is a certain state of mind that you are in (i.e. nihilistic hopelessness)
    • the state grips you and motivates you to seek out your favorite drug
    • dopaminergic reinforcement produced by the drug reinforces not only the craving for the drug itself but the initial state as well
  • When the anticipated event never comes, the dopamine level drops way below the baseline, and if after the drop the event happens, it produces an even stronger kick.
  • 52:20 If you are highly anticipatory and it doesn’t make itself manifest, then you are seriously wrong, so you are going to take an emotional hit
    • The pain might be associated with the beginning stages of the death of the systems that mediated this initial response
    • You have to eradicate the systems that made you anticipate something that never came, but they are in some sense alive, so you’re probably going to pay a price in something approximating pain.
  • 53:30 depressive cascade
    • you anticipate something and you make a mistake
    • how significant is the mistake?
      • «oh well, it can happen to anybody»
      • «I’m near 50 years old, I should be much more responsible, there is something wrong with me as a person»
      • a depressive person can go even further:
        • not only there is something wrong with me in this decision
        • this decision is just like any other decision I make
        • I never make a good decision
        • in the past I never made a good decision
        • there is no way I’m going to change in the future
    • how to evaluate a mistake
      • «here are all things you do right, but in this case here is the minimal thing you did incorrectly and this is how to alter it»
      • the higher a person is in neuroticism, the higher the probability of the cascade
      • errors are due to state, not trait
      • make it as local as you can
      • before evaluating a mistake you want to remove yourself from the state of rage or anxiety because those states are very low resolution and push you towards global accusations
      • what is the minimal possible behavioral transformation that ensures that similar mistakes don’t happen in the future
      • If your roof is leaking, you don’t have to rebuild the house from scratch
  • The subjective effect of dopamine is caused not by the absolute amount, but by the delta between the baseline and the state
  • 1:00:00 some people tend to overinflate their wins — manics
    • For a manic every one of their possible selves is wonderful
      • Every possibility gives them a dopaminergic kick
      • You shouldn’t be positive about everything
      • When a system loses its ability to focus and discriminate then it becomes pathological
      • Someone who is manic is a different person every second
      • Dopaminergic system is about the pursuit, not the outcome
        • When the dopamine is elevated it puts our perspective outside of our body (that person, that lover, that food, that target)
          • manics are all about plans — I’m going to do this and that — and not about the execution
  • The opposite is in neuroticism
    • self-consciousness loads heavily on neuroticism
      • When you fall into anxiety, there is an internal obsessiveness — which parts of me are malfunctioning and need to be eradicated?
    • if a socially anxious person goes into a social gathering, they are so focused on their internal sensations, they fail to make eye contact — conversation fails and they fall into a spiral
      • solution — don’t calm yourself down, but calm the other person down — focus your attention outward
  • 1:05:37 The data shows that overly anxious people are too much in touch with their bodily signals (they can count their heartbeats), so they are better off avoiding inward-focused meditations
    • On the other hand they might just be gripped by their thoughts.
    • If they do this voluntarily, it activates another system
    • This is why exposure therapy works
      • I’m afraid of something
      • If I go near it, then I’m possessed with negative emotions
      • That’s if you go near it accidentally
      • If you do it purposefully the response will be quelled.
    • People don’t become less afraid, they become braver instead

Short and Long Term Effects of Dopamine on the body

  • 1:08:40 There are 2 modes of changing responses
    • neural plasticity
    • adding new cells or rewiring
  • Any system that taps into the dopamine system is highly subject to reward neural plasticity ^daa4d3
    • If you give somebody a drug that increases dopamine, neural plasticity is increased for the next 1–4 hours: it takes fewer repetitions to create a permanent shift in neurology
    • If you believe that you are doing something important and desirable, you are approaching the valued goal, you have a lot of anticipation as a consequence of that, you put yourself into a neuro-chemical state that facilitates learning
      • Woo statements like the Secret, affirmations stuff do work because they capture the way our neurology works: ^9d995c
        • Prefrontal Cortex as a flexible rule-setting machine that taps into the dopamine system can adopt new rules for reward release in the brain — hence improved neural plasticity
  • When you are writing an essay, you have to start with a question in mind whose answer you regard as worth finding, otherwise the whole exercise is a lie.
  • 1:16:00 Dopamine system is depletable but renewable and self-amplifying.
    • If I don’t have a specific goal, when I do even a menial task like making a coffee, it completes a dopaminergic circuit and results in a release of dopamine which in turn amplifies our ability to think into the future, to make additional plans and increases confidence and energy. ^e69c88
      • Adrenaline is neural energy, it gives us the ability to get up and go. Adrenaline is manufactured from dopamine.
  • Neuroplasticity works on a short scale. The slower system is a hormonal control
    • Testosterone and Estrogen are both secreted when the dopamine system is activated.
      • Dopaminergic neurons connect with the Pituitary Gland, which releases gonadotropins such as luteinizing hormone; these stimulate the testes and ovaries to release the hormones
      • Sex steroid hormones are vital not only for reproductive but also for motivational biology.
        • Steroid hormones are lipophilic, they can go to the nucleus of a cell and control gene expression.
      • If you achieve wins repeatedly, testosterone is the molecule that controls not only immediate cell function but also gene expression.
  • Cognitive appraisal is critical:
    • «I’m someone who can get things done (even if they are small)»
    • «Small things are not small»
  • Experiment:
    • 2 rats (or people): one normal, one with depleted dopamine
    • If you allow them to experience something pleasurable, they both will enjoy it
    • If you put any kind of a task between an animal and the reward, the animal with depleted dopamine will not do the task
  • Anxiety is a natural system of getting you to move, it’s a «bias towards action».

Effects of Pornography

  • Dopaminergic system is generalizable to many different behaviors.
  • The ability to access repeated dopamine surges without any effort or directed pursuit is pathological.
    • For example cocaine increases dopamine, but the only system that gets rewarded is the drug-seeking behavior.
  • After the orgasm a great amount of prolactin is released which blocks the release of dopamine and testosterone for a long time.
  • 1:37:00 Pornography tends to reinforce circuits related to watching other people having sex, not engaging in the activity in the first place.
  • Masturbation leaves an open loop of neural chemicals including oxytocin and prolactin. Dopamine increases during the pursuit, then it peaks and then drops below the baseline after orgasm and ejaculation. This puts a person in an unmotivated state. It depletes the dopamine system.

My Top 10 Books

Today a friend asked me what they should read to learn more about product marketing. I said I didn’t know much about product marketing, but I listed a few books that have made a big difference in the way I view the world over the past few years.

My friend was very appreciative of the list and said it was just the thing, so I thought I’d post it on the blog to make it easier to answer questions like this in the future by just sending a link.

So here is the list of my top 10 books, broken down into 3 categories: learning, life in general, and wealth creation.

On Learning

How to Read a Book
by Mortimer J. Adler and Charles Van Doren

The best books are generally so packed with information that it’s hard to digest them. Some time ago I read 40 books in a year. But then I found that at best I could only remember 1% of the material. There’s no point in investing the time when knowledge retention is so low.

This book helps to solve that problem. It contains a toolkit for extracting the most knowledge and understanding from books.

How to Take Smart Notes
by Sönke Ahrens

A great book about the Zettelkasten note-taking method.

It’s closely related to «How to Read a Book». It addresses the same problem: how to get the most out of reading books, watching videos, and consuming information through other media.

A Mind for Numbers
by Barbara Oakley

This book contains a set of principles and protocols to learn anything. Everything I have learned in the last five or seven years I owe to this book. It really taught me how to learn.

Notable examples: the difference between focused and diffuse learning, how to avoid the illusion of competence, how to prevent procrastination.

On Life in General

Thinking, Fast and Slow
by Daniel Kahneman

Daniel Kahneman won the Nobel Prize in economics in 2002. Over several decades, he and Amos Tversky revolutionised our understanding of how people make decisions. They studied cognitive biases such as loss aversion, anchoring, the halo effect, the availability heuristic, outcome bias, the planning fallacy, and many more.

This book has opened my eyes to how I and other people think and make decisions in everyday life. I reread parts of it almost every year.

Incerto
by Nassim Nicholas Taleb

Well, to be honest, it’s not a book but 5 books connected by a theme: understanding the world through the lenses of risk, luck, uncertainty, and probability. If I had to recommend a single book of the series, I would pick «Antifragile», but all of them are worth reading.

Behave
by Robert Sapolsky

Dr. Sapolsky is a professor of neurology at Stanford University and the author of an amazing lecture series on human behavioral biology.

At first glance, the subject of the book looks very simple: Let’s imagine that a behavior has taken place. Someone has done something. Why did they do it?

The book tries to answer this question on all possible time scales, from microseconds before the act, to minutes before the act, to years and centuries.

On Wealth Creation

4-Hour Workweek
by Tim Ferriss

This is a great book with a very bad title. It gives the impression that it is about how to work less. It is not. It is about a mentality of maximizing freedom and happiness by achieving financial independence.

It’s a very practical book, with a number of tools and practices that have helped me a lot. My workweek is still about 70 hours a week, but I love every moment of it.

The Almanack of Naval Ravikant
by Eric Jorgenson, with a foreword by Tim Ferriss

This is a compilation of tweets, podcasts, and other conversations with Naval Ravikant about building wealth and happiness. Naval is a founder of angellist.co, entrepreneur, investor, and philosopher. Naval knows what he’s talking about.

Show Your Work
by Austin Kleon

It is a very short book on why and how to create and maintain a personal blog. It can be read in about 2 hours and is full of practical advice on how to start writing and maintain the process over time. It has helped me and many of my friends to post regularly online and gradually build a social following around our work.

Launch
by Jeff Walker

If «Show Your Work» helps you build an online following, this book helps you build a business based on that following. It focuses on selling information products, but can work for other types of products as well. I have some experience in online marketing and can confirm that the strategy is legit.

The only warning I’d give is not to pay too much attention to the first 3 chapters. The author tells his story and it’s very much like the cliché «I was broke and didn’t know what to do! If someone had told me back then that I’d make millions online, I wouldn’t have believed it! But then…». Don’t skip the chapters, but don’t dwell on them either. The good stuff starts later and it’s worth it.

Does Culture Eat Strategy for Breakfast?

What is culture?

People are conscious. Occasionally. Consciousness is great for problem solving, but not for execution. We learn concepts and skills consciously, but later apply them without explicit thought.

At least while the skill works. When we encounter an obstacle, consciousness returns, deals with the problem, and then fades back into the background. See "System 1" and "System 2" by D. Kahneman.

At the individual level, the conversion of conscious to unconscious is called skill, habit, or intuition. In a family or tribe, it is called tradition. In a society it becomes culture (see A Hunter-Gatherer’s Guide to the 21st Century).

A corporate culture is the same thing: a set of unconscious values and rules that guide people’s behavior.

Much like an individual, a tribe or a company is "unconscious" most of the time. We do things the same way that worked before. When everything is working, we do not all need to get together and discuss every tiny detail.

But when something goes wrong, we become collectively aware. We share the information and ideas that each member is aware of, and then propose hypotheses, observations, and challenges until we arrive at a new answer. In other words we formulate a strategy.

Then we execute, adapt, and refine the strategy. Over time, it becomes more and more automatic, "unconscious," and becomes part of a culture.

So which is more important, a strategy or a culture? Both. In good times, a culture provides cheap and efficient guidance. When the landscape has changed and the old methods no longer work, a new strategy must be developed and implemented.

The bigger question is how to know if current times are good or not.

How to Make Products

In February 2022, I signed up for Ivan Zamesin’s course on product development. This post is a collection of my notes from the lectures.

Ivan Zamesin

Lecture 1. Why Do People Buy or Do Anything

People make decisions in this way:

  1. first an emotion,
  2. then a decision,
  3. and finally rationalisation.

People’s needs can be unconscious, partially conscious, or fully conscious.

It’s almost impossible to discover unconscious needs. The other types can be discovered through interviews.

The task of the product is to understand what pleasant and unpleasant emotions people feel in the situation in question, and then offer a way to reduce unpleasant emotions and increase the number of pleasant ones through the product.

The algorithm for detecting emotions:

  1. Ask about the situation — The courier was late, I stayed hungry.
  2. What’s bad about the courier being late and you staying hungry? — Things didn’t go according to plan! (What was the plan?) — I was planning to finish the call and eat, and now I’ve to wait even longer. I felt upset and annoyed.
  3. Ok. So you were upset and annoyed, right? — Yes
  4. If 10 is the greatest annoyance in the world, how angry were you? — Like 4.
  5. How often do you find yourself in situations like the one you described?

Estimating the strength of an emotion is only necessary to compare problems with each other. In addition to strength, it’s also important to assess frequency.

Common negative emotions: fear, anger, guilt.

Lecture 2. Intro into Jobs to be Done Framework

How people act

  1. A person is in a certain context and in some emotional state and wants to arrive at some other emotional state.
  2. A person «formulates» a mental request to the brain: «please get me to this emotional state».
  3. The brain tries to guess the best solution.

Definition of the Job

A job is an abstraction on top of the aforementioned algorithm:

  • When… (mindset, context, trigger)
  • I want… (a clear result)
  • So that… (new emotional state)
  • Solution… (what do clients hire)
    • Problems with this solution (a problem may exist only in the context of a solution)

A Hierarchy of Jobs

Since each person has different needs, all possible jobs are placed in a hierarchy.

Each job is placed somewhere on the hierarchy

For example, on the existential level, I want to survive and procreate. So I want to have a happy marriage.

A marriage is happier when we’re well rested, so I want to go on a nice vacation with my wife. To go on a vacation, I need to choose a destination, buy tickets and book a hotel, and then we have to go.

During the trip, we have to arrive at our destination, get settled in the hotel, have dinner, rest, then go surfing, party and chill. And when the vacation is over, we have to go home again.

Job Types

  • Sequential
    • Each action presupposes the end of the previous one.
    • Book hotel → Get to place → Stay → Return.
  • Cyclic
    • Performed repeatedly.
    • Order a cab every morning to get to work.
  • Viral
    • Involves other people.
    • Scheduling a meeting, organising a family outing or party, sending a bill.
  • Tax
    • A job that is mandatory in some context.
    • Going through security at the airport.

JTBD Interviews

There are a variety of different JTBD interviews. Right now we are focusing on two: segment search and product search.

Segment Search

Segment Search Interview Template

In the Segment Search interviews, the interviewer asks about how people have solved a series of high-level jobs over months and years. For example, «Tell me, please, how did you plan your vacation?». The goal is to find the segment of customers with the greatest potential.

Product Search

The Product Search Interview is very similar to the Segment Search Interview, but it focuses on a single selected segment and on examining a series of specific steps that people take to accomplish a job. For example, «How do you usually book hotels?».

👷 The course is still in progress, so more notes will be added to this post after each lecture

A Big Pandas Cheat Sheet

What is Pandas and What is it Used For

Pandas is a Python data library. It makes life easier for analysts: where 10 lines of code were needed before, one line is now enough.

For example, to read data from a CSV file, in standard Python you must first decide how to store the data, then open the file, read it line by line, separate the values, and strip the data of special characters.

With pandas, things are simpler. First, you do not have to worry about how the data is stored — it’s in the data frame. Second, you only have to write a single command:
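As a minimal sketch, assuming a tiny CSV with made-up columns (month, revenue) supplied as inline text instead of a real file, that single command might look like this:

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path instead.
csv_text = "month,revenue\nJan,100\nFeb,120\n"

# The single command: parse the CSV straight into a dataframe.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

With a real file, the first argument would simply be the path to it.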

Pandas adds new data structures to Python — series and dataframes. Let’s examine them closely.

Data Structures: Series and DataFrames

Series are one-dimensional arrays of data. They are very similar to Python lists but behave differently: an operation on a list applies to the list as a whole, while an operation on a series applies element by element.

That is, if you multiply a list by 2, you get the same list repeated 2 times.

But if you multiply the series, its length will not change, but the elements will double.
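A short sketch of the difference, using an arbitrary three-element list chosen for illustration:

```python
import pandas as pd

numbers = [1, 2, 3]
print(numbers * 2)        # the list is repeated: [1, 2, 3, 1, 2, 3]

series = pd.Series(numbers)
doubled = series * 2      # each element is doubled, the length stays 3
print(doubled)
```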

Note the first column of the output. This is the index that stores the addresses of each element in the series. Each element can then be retrieved by accessing the desired address.

Another difference between series and lists is that you can use any values as indexes, which makes the data clearer. Let us say we are analyzing monthly sales. We use the names of the months as indexes, and the values are the sales.

Now we can get the values of each month:
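A possible version of this, with invented sales figures and month labels used as the index:

```python
import pandas as pd

# Illustrative figures only; the month names act as the index.
monthly_sales = pd.Series([10500, 11200, 9800], index=["Jan", "Feb", "Mar"])

feb = monthly_sales["Feb"]                  # a single month by its label
jan_and_mar = monthly_sales[["Jan", "Mar"]]  # several months at once
print(feb)
print(jan_and_mar)
```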

Since a series is a one-dimensional data set, it is convenient to store measurements individually. In practice, however, it makes more sense to group the data together. For example, if we are analyzing monthly sales, it makes sense to see not only revenue, but also the number of products sold, the number of new customers, and the average bill. Dataframes are great for this.

Dataframes are tables. They have rows, columns and cells.

Technically, columns of a dataframe are series. Since columns usually describe the same objects, all columns share the same index.
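To see this for yourself, here is a small sketch with a hypothetical monthly sales table (the column names and numbers are made up):

```python
import pandas as pd

sales = pd.DataFrame(
    {"revenue": [10500, 11200, 9800], "items_sold": [30, 35, 28]},
    index=["Jan", "Feb", "Mar"],
)

revenue = sales["revenue"]   # selecting one column gives back a series
print(type(revenue))

# Both columns carry the same index as the dataframe itself.
same_index = revenue.index.equals(sales["items_sold"].index)
print(same_index)
```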

I will explain how to create dataframes and load data into them in the following chapter.

Creating a DataFrame and Loading Data

Sometimes we don’t know what the data is and can’t specify the structure beforehand. Then it’s convenient to create an empty dataframe and fill it with data later.
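A minimal sketch of that workflow, assuming the data arrives later as plain Python lists (the column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame()        # empty: no rows, no columns yet
was_empty = df.empty
print(was_empty)

# Fill it later, column by column.
df["month"] = ["Jan", "Feb", "Mar"]
df["revenue"] = [10500, 11200, 9800]
print(df)
```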

And sometimes the data is already there, but stored in a variable from standard Python, such as a dictionary. To create the dataframe, we pass this variable to the same command.
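For instance, with a hypothetical dictionary whose keys become column names:

```python
import pandas as pd

# Keys become columns, the lists become the column values.
data = {
    "month": ["Jan", "Feb", "Mar"],
    "revenue": [10500, 11200, 9800],
}

df = pd.DataFrame(data)
print(df)
```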

It happens that data is missing in some records. For example, look at the goods_sold list — it contains sales broken down by product category. In the first month we sold cars, computers and software. In the second month there were no cars, but bicycles, and in the third month there were cars again, but no bicycles.

When you load the data into the dataframe, Pandas creates columns for all product categories and fills them with data where possible.
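Here is a sketch of such a goods_sold list. Only the product categories follow the description above; the quantities are invented for illustration:

```python
import pandas as pd

# One dictionary per month; a missing key means the category was not sold.
goods_sold = [
    {"cars": 2, "computers": 10, "software": 3},
    {"bicycles": 1, "computers": 5, "software": 8},
    {"cars": 1, "computers": 7, "software": 4},
]

sales = pd.DataFrame(goods_sold)
print(sales)   # categories absent in a month show up as NaN
```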

Note that the sales of bicycles in the first and third months are NaN — which stands for Not a Number. This is how Pandas marks missing values.

Now let’s break down how to load data from files. Most often the data is stored in Excel spreadsheets or in CSV or TSV files.

Excel tables are read with the pd.read_excel() function. Pass the path to the file on your computer and the name of the sheet you want to read (the sheet_name parameter). The command works with both xls and xlsx files.

The csv and tsv files are text files with data separated by commas or tabs.

Both are read with the pd.read_csv() function; for TSV files, pass the tab character through the sep parameter (short for «separator»).
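A small sketch of both cases, using inline text in place of real files (the contents are made up):

```python
import io
import pandas as pd

csv_text = "month,revenue\nJan,100\nFeb,120\n"
tsv_text = "month\trevenue\nJan\t100\nFeb\t120\n"

from_csv = pd.read_csv(io.StringIO(csv_text))             # comma is the default separator
from_tsv = pd.read_csv(io.StringIO(tsv_text), sep="\t")   # tab-separated values

print(from_csv.equals(from_tsv))   # both yield the same table
```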

When loading, we can assign a column as an index. Imagine you are loading a table with orders. Each order has its own unique number. If we assign this number as an index, we can then extract the data for an order using df.loc[order_id]. Otherwise we have to write the filter df[df['id'] == order_id].

In a later section, I will explain how to get the data from the dataframes. To assign a column as an index, add the index_col parameter to the read_csv() function, which corresponds to the name of the desired column.
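The two ways of looking up an order can be compared in a short sketch. The order numbers, customer ids and amounts are hypothetical, and the file is inlined so the example runs on its own.

```python
import io
import pandas as pd

# Hypothetical orders file.
orders_csv = """id,customer_id,sales
100006,DK-13375,377.5
100090,EB-13705,699.25
"""

# Without an index column we have to filter on the id column.
plain = pd.read_csv(io.StringIO(orders_csv))
by_filter = plain[plain["id"] == 100090]

# With index_col the order number becomes the row label.
indexed = pd.read_csv(io.StringIO(orders_csv), index_col="id")
by_label = indexed.loc[100090]

print(by_label)
```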

After you load the data into the dataframe, it’s a good idea to examine it — especially if you are not familiar with it.

Exploring the Data

Let’s imagine we’re analyzing the sales of an online store. We have data on orders and customers. Let’s load the file with the sales into the variable orders and specify that the id column should be used as the dataframe index.

Let’s examine four attributes of every dataframe: .shape, .columns, .index and .dtypes.

.shape indicates how many rows and columns the dataframe has. It returns a tuple of values (n_rows, n_columns). The rows come first, then the columns.

Our dataframe has 5009 rows and 5 columns.

Okay, we know the scale. Now we want to see what information is in each column. Use .columns to find out the column names.

Now we see that the table contains the order date, delivery method, customer number and sales.

Use .dtypes to find out the data types in each column and see if they need to be processed. There are cases where numbers are loaded as text. If we try to add two text values '1' + '1', we’ll not get the number 2, but the string '11'.

The object type is text, float64 is a fractional number like 3.14.

With the .index attribute we can see the row names.

As expected, there are order numbers in the index of the dataframe: 100762, 100860, and so on.
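The four attributes can be tried on a tiny stand-in for the orders table. The column names order_date, ship_mode, customer_id and sales, as well as all values, are assumptions for this sketch; the real file has thousands of rows.

```python
import io
import pandas as pd

# Hypothetical three-row excerpt of an orders file.
orders_csv = """id,order_date,ship_mode,customer_id,sales
100006,2014-09-07,Standard,DK-13375,377.5
100090,2014-07-08,Standard,EB-13705,699.25
100293,2014-03-14,First,NF-18475,91.0
"""
orders = pd.read_csv(io.StringIO(orders_csv), index_col="id")

print(orders.shape)    # (n_rows, n_columns)
print(orders.columns)  # column names
print(orders.dtypes)   # data type of each column
print(orders.index)    # row labels, here the order numbers
```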

In the sales column, the value of each order is stored. To get the range, average and median cost, we use the .describe() method.

Finally, to look at some examples of dataframe entries, we use the .head() command. By default it returns the first 5 records of the dataframe.

Now that we have a first idea of dataframes, let’s discuss how to get data out of them.
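A quick sketch of .describe() and .head() on made-up numbers, chosen so the statistics are easy to check by eye.

```python
import pandas as pd

# Seven hypothetical orders with round sale amounts.
orders = pd.DataFrame(
    {"sales": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]},
    index=pd.Index([1, 2, 3, 4, 5, 6, 7], name="id"),
)

stats = orders["sales"].describe()  # count, mean, std, min, quartiles, max
first_rows = orders.head()          # first 5 rows unless another n is given

print(stats)
print(first_rows)
```

The median is reported under the 50% label.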

Getting the Data from DataFrames

You can retrieve data from dataframes in a number of ways: by specifying column and row numbers, by using conditional statements, or by using the query language. In the following sections I’ll tell you more about each method.

Specifying the right rows and columns

Let’s continue with the analysis of the online store sales that we loaded in the previous section. Suppose I want to display the sales column. To do this, enclose the column name in square brackets and place it after the name of the dataframe: orders['sales'].

Note that the result of this command is a Series with the same index.

If you need to output multiple columns, insert a list with their names in square brackets: orders[['customer_id', 'sales']]. Caution: The square brackets are now double. The first pair of brackets is from the dataframe, the second pair is from the list.
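A small sketch of both forms, using a hypothetical three-row orders table.

```python
import pandas as pd

# Hypothetical data; only the column names come from the text.
orders = pd.DataFrame(
    {
        "customer_id": ["A-1", "B-2", "C-3"],
        "ship_mode": ["First", "Standard", "Standard"],
        "sales": [21.5, 14.0, 129.5],
    },
    index=pd.Index([100328, 100363, 100391], name="id"),
)

sales = orders["sales"]                    # single brackets: a Series
subset = orders[["customer_id", "sales"]]  # list of names: a DataFrame

print(type(sales).__name__, type(subset).__name__)
```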

Let’s move on to the rows. You can filter them by index label and by position. For example, if we want to display only orders 100363, 100391 and 100706, there’s a command .loc[] for that.

At other times we want only the orders in positions 1 to 3, regardless of their order numbers. Then the command .iloc[] is used.

You can filter dataframes by rows and columns at the same time:
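A sketch of label-based and position-based selection. The order numbers 100363, 100391 and 100706 come from the text; the remaining values are invented, and the slice 0:3 is one possible reading of "the first three rows" (positions are counted from zero).

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "customer_id": ["A-1", "B-2", "C-3", "D-4", "E-5"],
        "sales": [21.5, 14.0, 129.5, 88.0, 5.25],
    },
    index=pd.Index([100328, 100363, 100391, 100678, 100706], name="id"),
)

# By label: rows whose index equals these order numbers.
by_label = orders.loc[[100363, 100391, 100706]]

# By position: the first three rows, whatever their order numbers are.
by_position = orders.iloc[0:3]

# Rows and columns at once: selected orders, sales column only.
both = orders.loc[[100363, 100391], ["sales"]]

print(both)
```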

Often you don’t know the ids of the desired orders in advance. For example, if the task is to get orders worth more than $1000. This task can be conveniently solved with the help of conditional operators.

If — then. Conditional operators

Problem: You need to find out where the largest orders come from. Let’s start with all purchases worth more than $1000:

Remember I mentioned at the beginning of this article that in the Series all operations are applied item by item? Well, the orders['sales'] > 1000 operation is applied to every element in the series and, if the condition is met, returns True. If it’s not met, False is returned. The resulting Series of True and False values is stored in the filter_large variable.

The second command uses this Series to filter the rows of the dataframe. If the filter_large element is True, the row is displayed; if it’s False, it is not. The result is a dataframe with orders worth more than $1000.
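The two commands described above could look like this. The variable name filter_large comes from the text; the data is hypothetical.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "ship_mode": ["First", "Standard", "Standard", "First"],
        "sales": [1500.0, 200.0, 3200.0, 999.0],
    },
    index=pd.Index([101, 102, 103, 104], name="id"),
)

# Element-wise comparison: one True/False value per order.
filter_large = orders["sales"] > 1000

# Keep only the rows where the mask is True.
large_orders = orders.loc[filter_large]

print(filter_large)
print(large_orders)
```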

I wonder how many expensive orders were shipped first class? Let’s add another condition to the filter.

The logic hasn’t changed. In the variable filter_large we store the Series that marks the rows satisfying orders['sales'] > 1000. In filter_first_class we store the Series that marks the rows satisfying orders['ship_mode'] == 'First'.

Then we combine both Series using the logical AND: filter_large & filter_first_class. We obtain a new Series of the same length, in which only orders worth more than $1000 and shipped first class are True. There can be any number of these conditions.
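A sketch of the combined filter on hypothetical data. The variable names follow the text.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "ship_mode": ["First", "Standard", "Standard", "First"],
        "sales": [1500.0, 200.0, 3200.0, 999.0],
    },
    index=pd.Index([101, 102, 103, 104], name="id"),
)

filter_large = orders["sales"] > 1000
filter_first_class = orders["ship_mode"] == "First"

# & is the element-wise logical AND for boolean Series.
large_first_class = orders.loc[filter_large & filter_first_class]

print(large_first_class)
```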

The Query Language

Another way to solve the previous problem is to use the query language. We write all the conditions in one string 'sales > 1000 & ship_mode == "First"' and pass them to the .query() method. The query is more compact.

Pro tip: you can store filter values in a variable and refer to it in the query with the @ symbol: 'sales > @sales_filter'.
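A sketch of .query() with and without an external variable. The name sales_filter comes from the tip above; the data is hypothetical.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "ship_mode": ["First", "Standard", "Standard", "First"],
        "sales": [1500.0, 200.0, 3200.0, 999.0],
    },
    index=pd.Index([101, 102, 103, 104], name="id"),
)

# All conditions in one string.
compact = orders.query('sales > 1000 & ship_mode == "First"')

# The threshold kept in a Python variable and referenced with @.
sales_filter = 1000
with_variable = orders.query('sales > @sales_filter & ship_mode == "First"')

print(with_variable)
```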

Now that we’ve figured out how to retrieve data from the data frame, let’s move on to counting aggregate metrics: Number of Orders, Total Sales, Average Check, and Conversion Rate.

Calculating Metrics

Let’s calculate how much money the store made with each delivery class. Let’s start with a simple calculation: we add up the income from all orders. To do this, use the .sum() method.

Let’s add a shipping class. Before summing, group the data using the .groupby() method.
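A minimal sketch of the total and the per-class totals, on made-up amounts.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "ship_mode": ["First", "Standard", "First", "Second"],
        "sales": [100.5, 200.25, 50.0, 10.25],
    }
)

# Revenue over all orders.
total = orders["sales"].sum()

# Revenue per shipping class: group first, then sum within each group.
by_mode = orders.groupby("ship_mode")["sales"].sum()

print(total)
print(by_mode)
```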

3.514284e+05 is the scientific format of number output. It means 3.514284 * 10^5, or roughly 3.51 * 10^5. Since we don’t need such precision, we can instruct Pandas to round the values to hundredths.
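One way to do the rounding is the .round() method, shown here on a hypothetical Series of revenue figures (the values are invented, apart from the scientific-notation number quoted above).

```python
import pandas as pd

# Scientific notation is just another spelling of the same number.
same_number = float("3.514284e+05") == 351428.4

# Hypothetical revenue per shipping class.
revenue = pd.Series(
    [351428.4237, 1358215.7431],
    index=["First", "Standard"],
    name="sales",
)

# Round every value to two decimal places.
rounded = revenue.round(2)

print(same_number)
print(rounded)
```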

Much better! Now we see the amount of revenue for each delivery class. We can’t tell from the total revenue whether things are getting better or worse. Let’s add a breakdown by order date.
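Grouping by two columns at once might look like this. The column name order_date is an assumption, and the rows are invented.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "order_date": ["2014-01-03", "2014-01-03", "2014-01-04", "2014-01-04"],
        "ship_mode": ["Standard", "Standard", "First", "Standard"],
        "sales": [10.0, 5.5, 300.0, 78.25],
    }
)

# A list of column names gives one group per (ship_mode, order_date) pair.
daily = orders.groupby(["ship_mode", "order_date"])["sales"].sum()

print(daily)
```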

You can see that the revenue jumps from day to day: sometimes it’s $10, sometimes it’s $378. I wonder if it’s the number of orders or the average check that changes. Let’s add the number of orders to the sample. To do this, instead of .sum(), we use the .agg() method, passing in a list of the names of the required functions.
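A sketch of .agg() with several function names. Here sum is the revenue, count is the number of orders, and mean is the average check; the data and the order_date column name are assumptions.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "order_date": ["2014-01-03", "2014-01-03", "2014-01-04", "2014-01-04"],
        "ship_mode": ["Standard", "Standard", "First", "Standard"],
        "sales": [10.0, 5.5, 300.0, 78.25],
    }
)

# Each name in the list becomes a column of the result.
daily = orders.groupby(["ship_mode", "order_date"])["sales"].agg(
    ["sum", "count", "mean"]
)

print(daily)
```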

Wow, it turns out that it’s the average check that jumps. I wonder what the best day was? To find out, we sort the resulting dataframe and display the top 10 days by revenue.

The command has become very large and doesn’t read well. To simplify it, you can split it into several lines. We put a backslash \ at the end of each line.
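The sorted, multi-line version of the command could be sketched as follows. The data and the order_date column name are assumptions; note that nothing, not even a comment, may follow a backslash on the same line.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "order_date": ["2014-01-03", "2014-01-03", "2014-01-04", "2014-01-04"],
        "ship_mode": ["Standard", "Standard", "First", "Standard"],
        "sales": [10.0, 5.5, 300.0, 78.25],
    }
)

# Group, aggregate, sort by revenue (largest first), keep at most 10 rows.
top_days = orders \
    .groupby(["ship_mode", "order_date"])["sales"] \
    .agg(["sum", "count", "mean"]) \
    .sort_values(by="sum", ascending=False) \
    .head(10)

print(top_days)
```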

On its most successful day, March 18, 2014, the store made $27,000 with a standard shipping class. I wonder where the customers who placed those orders came from? To find out, you need to combine the order data with the customer data.

Joining Dataframes

So far, we have only looked at the table of orders. However, we also have data about the customers of the online store. Let us load them into the customers variable and see what they are.

We know the type of customer, their location, their name and the name of their contact person. Each customer has a unique customer ID. The same number is in the customer_id column of the orders table. We can find out which orders each customer has placed by using these columns. For example, let us look at the orders placed by the user CG-12520.
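Looking up one customer's orders might be done like this. The id CG-12520 comes from the text; the other id and the amounts are invented.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "customer_id": ["CG-12520", "DV-13045", "CG-12520"],
        "sales": [261.5, 14.75, 731.0],
    },
    index=pd.Index([152156, 138688, 164175], name="id"),
)

# Keep only the rows whose customer_id matches.
cg_orders = orders.query("customer_id == 'CG-12520'")

print(cg_orders)
```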

Let us return to the task from the previous section: find out which customers placed orders with standard shipping on March 18. To do this, we merge the Customers and Orders tables. Dataframes are combined using pd.concat(), .merge(), and .join(). Their capabilities overlap, and they differ mainly in syntax, so in practice it is often enough to know how to use one of them.

Let me show you an example of .merge().

In .merge(), I first specified the names of the dataframes I wanted to join. Then I specified exactly how they should be merged and which columns should be used as keys.

The key is a column that connects the two data frames. In our case, it is the customer ID. In the table with orders it is in the column customer_id, and in the table with customers it is in the index. So in the command we write: left_on='customer_id', right_index=True.
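A sketch of such a merge, assuming the pd.merge() function form and an inner join. The parameters left_on and right_index are the ones named above; the customer names, cities and amounts are made up.

```python
import pandas as pd

# Customers are indexed by their id.
customers = pd.DataFrame(
    {"name": ["Customer One", "Customer Two"], "city": ["Henderson", "Los Angeles"]},
    index=pd.Index(["CG-12520", "DV-13045"], name="id"),
)

# Orders refer to customers through the customer_id column.
orders = pd.DataFrame(
    {
        "customer_id": ["CG-12520", "DV-13045", "CG-12520"],
        "sales": [261.5, 14.75, 731.0],
    }
)

# Match orders.customer_id against the index of customers.
merged = pd.merge(
    orders, customers, how="inner", left_on="customer_id", right_index=True
)

print(merged)
```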

Solving the Problem

Let’s consolidate what we’ve learned by solving a problem. Let’s find the 5 cities that generated the most revenue in 2016.

First, let’s filter the orders from 2016

The city is an attribute of users, not orders. Let’s add user information.

Now we need to group the resulting dataframe by city and calculate the revenue.

Next: sort by descending order of sales and keep the top 5:
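The four steps can be sketched end to end on a toy dataset. The filtering by .dt.year is one possible way to select 2016 and assumes order_date holds real dates; all names and values here are hypothetical.

```python
import pandas as pd

customers = pd.DataFrame(
    {"city": ["Seattle", "Houston", "Chicago"]},
    index=pd.Index(["C-1", "C-2", "C-3"], name="id"),
)
orders = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            ["2015-12-30", "2016-02-01", "2016-05-10", "2016-07-04", "2016-11-11"]
        ),
        "customer_id": ["C-1", "C-1", "C-2", "C-3", "C-2"],
        "sales": [500.0, 100.0, 250.0, 75.5, 50.0],
    }
)

# 1. Keep only the orders placed in 2016.
orders_2016 = orders[orders["order_date"].dt.year == 2016]

# 2. Attach customer information, including the city.
with_city = pd.merge(
    orders_2016, customers, how="inner", left_on="customer_id", right_index=True
)

# 3. Revenue per city.
revenue_by_city = with_city.groupby("city")["sales"].sum()

# 4. Largest revenue first, top 5 only.
top_cities = revenue_by_city.sort_values(ascending=False).head(5)

print(top_cities)
```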

Done!

Try it yourself:

Take the order and customer data and try solving the following problems:

  1. How many orders were shipped first class in the last 5 years?
  2. How many customers from California are in the database?
  3. How many orders have they placed?
  4. Create a dataframe with the average checks for all states for each year.

That’s it for today! See you again soon!

A Hunter-Gatherer’s Guide to the 21st Century

Book Outline

This book is an attempt to identify, study and solve large-scale problems of our time through the lens of evolution: what should we eat, how should we sleep, how to care for our health, how to build relationships, how to raise children, how to build communities.

The main problem the authors identify is that, although humans are very good at adapting to a changing environment, the modern rate of change is too fast. This hyper-novelty leaves many people confused, anxious and lost, and it leads to many problems we face as a society and a species. The authors approach these problems by reasoning from first principles, in order to filter out incorrect and irrelevant information and avoid the naturalistic fallacy.

Chapters Summaries

1. The Human Niche

In the first chapter «The Human Niche» the authors introduce readers to individual and collective consciousness, lineage, epigenetic adaptation and the Omega Principle.

Humans don’t have a niche in nature. As a species we are generalists that can adapt to any habitat. The solution to the paradox is that even though as a species we are generalists, individually we are often specialists. In addition to individual consciousness we have also evolved a shared collective consciousness. It allows us to exchange ideas and learn with minimal friction.

Just as individual learning always goes from conscious to unconscious, so does collective learning. On an individual level unconscious learning is called skill or intuition; on a collective level it is called tradition and culture.

Culture is an epigenetic regulator that evolved to serve the genome yet is more flexible and can adapt more rapidly.

2. A Brief History of the Human Nature

In the second chapter, the authors give an overview of our big evolutionary history from the beginning of life and the emergence of the first single-celled organisms through the development of animals, fish, reptiles, mammals, primates, and finally humans. The authors show the time frame and origin of our features: bone structure, brain, heart, limbs and behavior.

The main point, in my opinion, is that being Homo Sapiens does not mean that we are no longer apes, primates, reptiles, fish, animals, vertebrates, and eukaryotes. We belong to each of these groups, and our physical and behavioral characteristics are the result of that. Humans are fish.

3. Ancient Bodies, Modern Worlds

The third chapter is named «Ancient Bodies, Modern Worlds». At the beginning of the chapter the authors establish that evolutionary adaptations may work through both genetic and cultural means and that the «Nature vs Nurture» dichotomy is false. Then the authors propose a test to determine if a trait is an evolutionary adaptation: it’s complex, it’s costly and it’s old.

The purpose of a trait is often unknown, as in the case of the appendix. Still, an unknown purpose should not motivate us to mess with or get rid of the trait: if it is an adaptation, it most likely has some hidden value that we don’t see due to our ignorance (Chesterton’s Fence).

An important feature of the evolutionary development is the presence of trade-offs. There are 2 types of trade-offs: allocation and design constraints. Both are ubiquitous. Humans found ways to evade some of the trade-offs by building outside of ourselves and combining specialisations. Yet not all trade-offs can be evaded.

👷 Work in Progress. I’m still writing the summaries of other chapters. Sign up if you want to be notified of their release.