Quantum Bayesian Networks

November 30, 2020

My Free Open Source Book “Bayesuvius” on Bayesian Networks and Causal Inference

Filed under: Uncategorized — rrtucci @ 3:08 pm

THIS BOOK IS CONTINUOUSLY BEING IMPROVED AND EXPANDED. MAKE SURE YOU HAVE THE LATEST VERSION FROM GITHUB FOR MAXIMUM SATISFACTION.

See also my software “JudeasRx” that implements many ideas in causal inference https://github.com/rrtucci/JudeasRx

See also “Famous uses of Bayesian Networks

June 27, 2020

My Pinned Tweet at Twitter

Filed under: Uncategorized — rrtucci @ 9:28 pm

This is the pinned Tweet on my company’s (www.ar-tiste.xyz) Twitter account

March 23, 2023

How can we define “understanding” for LLMs?

Filed under: Uncategorized — rrtucci @ 9:33 pm

This question has been asked several times on Twitter. Here is my answer to it.

Judea Pearl has given a very explicit and testable answer to this question. Currently, LLMs cannot practice the Scientific Method (SM) in a **deliberate** way (as opposed to a trial and error search), whereas humans can. RL alignment is a trial and error (T&E) procedure, like the cartoon shown below.  This is not how humans normally practice the SM (sometimes they do practice it that way, but not very often, because T&E is a highly inefficient process). The SM looks for causes, not correlations. Thus, it is closely linked to causal inference (CI). By being able to practice SM in a deliberate way, I mean: (1) having an EXPLICIT engine that can do Pearl’s 3 rungs of CI, and (2) having a DAG atlas. Once LLMs achieve these 2 goals, they will be much closer to achieving real understanding.

(For a more detail explanation, see this earlier blog post).

harris-e-mc-2

March 20, 2023

How to survive as an AI startup? Causal Inference

Filed under: Uncategorized — rrtucci @ 10:33 pm

The total number of AI/ML startups listed in Crunchbase is a whopping 8,683, as of March 20, 2023. Because of the unveiling of ChatGPT on Nov. 2023, and the AI war currently being waged between Google, Microsoft/OpenAI and Meta, AI startup creation is sure to accelerate in 2023.

Only a handful of these startups will survive much longer than 5 years.  For your AI startup to survive more than 5 years, it will have to stand out from the pack. A great way to stand out is to add Causal Inference  (CI) products to your product line. Very few startups are currently doing this, despite the fact that this can make your AI startup look much more attractive to clients and investors. Many pundits think that the next giant step for LLMs will be to add to them the capability to discover and remember causal world models and the ability to perform causal inference and the scientific method.

That is where http://www.artiste.xyz and its software SCuMpy come in. SCuMpy is free, open source Python software for doing causal inference. SCuMpy is easy to use and understand and very powerful. It can be used as a pedagogical tool, a research tool, or a service tool. Contact us at http://www.ar-tiste.com if you want to partner with us to integrate SCuMpy into your product line.

ai-cant-reason-why-wsj-2018

Economist did DID first.

Filed under: Uncategorized — rrtucci @ 1:35 am

Before Twitter finally crumbles, let me preserve here a few joke Tweets I recently made about Economists and DID.

Want to hear a funny joke? The Wikipedia article on Difference-in-Differences (DID) does not cite or even mention the work of John Snow, who invented DID. It only cites the work of economists. Economists truly think they own causal inference. https://en.wikipedia.org/wiki/Difference_in_differences

Another lunacy of economists: Economists have written and continue to write a ton of papers on the difference in differences (DID) method, something that can be explained in exhaustive detail in a few lines. When you use DID, you pray that there are no confounders.

DID should be taught in 7th grade, or earlier. The Newtonian equation for DID is
y2-y1-(y’2-y’1)
The Hawking-Venezziano-Witten String theory equation for DID is
y2-y1-(y’2-y’1)
I also asked ChatGPT to tell me some jokes about economists (My favorite in red)
  1. Why did the economist cross the road? To get to the other market.
  2. Why did the economist go to art school? To learn the principles of supply and demand.
  3. How many economists does it take to change a light bulb? None, the market will take care of it.
  4. Why was the economist unhappy at the beach? He couldn’t find a perfectly efficient market.
  5. What did the economist say when he won the Nobel Prize? “This proves that if you can’t find a job, you can always invent one.”
  6. Why did the economist refuse to go on a roller coaster? He didn’t want to experience the ups and downs of the business cycle.
  7. Why did the economist bring a ladder to the party? He heard the drinks were on the house.
  8. Why did the economist break up with his girlfriend? He claimed she had diminishing marginal utility.
  9. Why did the economist refuse to pay for parking? He believed in free market solutions.
  10. Why did the economist become a magician? He wanted to turn interest rates into disappearing acts.
  11. How do you confuse an economist? Ask them to explain their theory in simple terms.
  12. Why do economists prefer cats to dogs? Because cats are independent and self-sufficient, just like a perfectly competitive market.

March 13, 2023

Joined Mastodon Today. Liking it a lot.

Filed under: Uncategorized — rrtucci @ 2:56 am

I’m sick and tired of Elon Musk censoring (shadow banning) my posts on Twitter. I joined Mastodon today. From now on, I will cross-post on Twitter ( https://twitter.com/artistexyz ) and on Mastodon (https://mastodon.world/@rrtucci). So far, I am liking Mastodon a lot. No shadow banning.

March 11, 2023

“Gaussian Copulas” blamed for bringing down Wall Street in 2008.

Filed under: Uncategorized — rrtucci @ 11:15 pm

I just finished a new chapter entitled “Copula”, for my free, open source book Bayesuvius.

Assume x_1 and x_2 are real numbers and x=(x_1, x_2). Suppose you don’t know a joint distribution P(x), but you do know its marginals P(x_1), P(x_2). There are infinitely many possible joint distributions P(x) with those marginals. For example, P(x)=P(x_1)P(x_2). A copula for this problem is one of those joint distributions, a nice smooth one, usually a Gaussian. All this can of course be generalized to x=(x_1, x_2, \ldots x_n)

While writing my chapter, I came across a Wired article that blames the 2008 financial meltdown in large measure to over-reliance on Gaussian copulas.

WIRED MAGAZINE: 17.03
Recipe for Disaster: The Formula That Killed Wall Street
By Felix Salmon, 02.23.09

It appears that David. X Li,  a quant with a Ph.D. in Economics, led most of Wall Street astray with his introduction of “Gaussian copulas” to the world of high finance. (Never trust an economist). From what I can tell, the problem arose from modeling too many probability distributions by Gaussians, and from predicting future risk based only on correlations, rather than on causation. Twas assuming that correlation implies causation that got the Wall Street beast in the end.

king-kong-quote

My first coding test for ChatGPT: plot bivariate normal distribution in Python. Gets a grade of A++

Filed under: Uncategorized — rrtucci @ 8:32 pm

normal-dist-python

I tried this code and it works perfectly out of the box. Below is a sample plot. I am very impressed. Even if it just cut & pasted from the code of somebody who has done this before, ChatGPT understood exactly what I wanted, and found it in less than a second. normal-dist-plot

March 5, 2023

SCuMpy is ready to rumba. My software SCuMpy can now be trained with time series (a.k.a. panel data)

Filed under: Uncategorized — rrtucci @ 12:23 am

The purpose of this blog post is to announce that my free, open source, Python software “SCuMpy” for doing causal inference (CI), can now be trained with time series (a.k.a. panel data). Here is a Jupyter notebook showing how easy it is to train SCuMpy.

This is more general than the G-formula model used in epidemiology, because in SCuMpy, the structure of the dynamic DAG is arbitrary; it isn’t, as in the G-formula model, fixed/rigged to be the same for all problems.

SCuMpy can only handle linear SCM. Non-linear SCM and Causal Bayesian Networks are outside its purview. However, I always recommend that a CI problem be solved first in the linear SCM regime, where it can be solved quickly, exactly, and in closed form. This is often enough to grok the causal physics that is going on. Subsequently, to achieve higher precision, one can generalize the solution to the  more thorny, non-linear SCM and causal Bayesian Network regimes. “Linear case first” is very much the strategy followed in Control Theory. In tackling problems that exhibit feedback, as is the case in this blog post, the connection between Causal Inference and Control Theory runs very deep. DAGs with feedback and the recurrent DAGs used to describe time series, are two different graphical representations of the same physical phenomenon.

(Previous blog posts about SCuMpy)

I end this blog post with a famous quote that is clearly about causal feedback.

kennedy-causality

February 28, 2023

John F. Kennedy Causality

Filed under: Uncategorized — rrtucci @ 6:12 pm

kennedy-causality

This is a nice example of feedback loops in Causal Inference (CI). My software SCuMpy can now handle feedback loops in linear SCM. This is useful for applying CI to time-series (a.k.a., panel data)

Seagull Causation

Filed under: Uncategorized — rrtucci @ 5:56 pm

seagull-causation

February 19, 2023

SCuMpy can now do linear SCM with feedback loops

Filed under: Uncategorized — rrtucci @ 8:50 pm

feedback-pot-out

The above picture represents the Potential Outcomes DAG (in black) with all possible feedback arrows (in dashed green). It was drawn by my software SCuMpy using GraphViz.

Exciting news for me. After 2 weeks of intense work, I have derived exact, closed-form formulae for arbitrary linear SCM with feedback loops, and I have programmed those formulae into my software SCuMpy.

The derivations of the formulae can be found in my book Bayesuvius, under the chapter entitled “LDEN with feedback loops”.

Jupyter notebooks illustrating how to use SCuMpy to run these formulae for you, for a graph of your choice, can be found here:

Linear SCM with feedback loops can be “unrolled” into a “time-slice” that repeats itself, with adjacent time-slices connected by feedback arrows. A 2 time-slice, unrolled version of the above graph is shown below. (Again, this figure was drawn by SCuMpy via GraphViz). This theory and software can be used to do Causal Inference with time series (a.k.a. panel data).

potential-outcomes-2time-slices

February 1, 2023

SCuMpy can now estimate arrow gains (a.k.a. path coefficients) from a dataset of node values

Filed under: Uncategorized — rrtucci @ 7:04 pm

far-side-can-they-do-that

If you’ve ever tried to use linear SCM in practice, you might have wondered how to calculate the arrow gains (a.k.a. path coefficients or arrow fire) from the data. The relevant data can be presented as a  Pandas or R dataframe whose columns are labeled by the names of the nodes of the SCM, and whose rows give an instance of the values of the nodes.

This brief blog post is to announce that SCuMpy can now estimate the arrow gains from data. Here is a SCuMpy Jupyter notebook describing how to do this.

January 29, 2023

Automated Pearl Identifiability via SCuMpy

Filed under: Uncategorized — rrtucci @ 11:23 pm

cowboy-mystery

Can you identify this? (I reveal the answer in the comments to this blog post). Causal inference is hard, and CI identifiability is doubly hard!

Check out the newest jupyter notebook at the github SCuMpy repo.

Here is a quote from the notebook’s header:

This notebook analyses a linear SCM that comes from a lecture by Brady Neal, who attributes it to a Tian & Pearl 2002 paper, and says it’s an example of the “unconfounded children” criterion. This SCM does not satisfy either the backdoor or frontdoor criteria, but the query P(y|do(x)) is still known to be identifiable for this SCM.

The most general way to decide whether a do query for a particular DAG is identifiable, is by using Pearl’s Do Calculus rules. However, those rules are fairly complicated and therefore hard to automate.

We contend that by analyzing any DAG **symbolically**, in the linear regime, using SCuMpy, one can decide rigorously whether a do query for that DAG is identifiable or not. Hence, ScuMpy allows us, *if we have a single specific DAG in mind*, to bypass and supplant, in an automated fashion, the Do Calculus rules!

As shown in the example below, all we have to do to prove identifiability of the query P(y|do(x)), is to show that, after we amputate all arrows entering \underline{x}, the covariance \langle \underline{x}, \underline{y}\rangle becomes independent of the hidden (unobserved) variables. It doesn’t get simpler than that!

January 26, 2023

First Version of SCuMpy released and how to install it for python beginners

Filed under: Uncategorized — rrtucci @ 3:06 am

pond-scum

The purpose of this blog post is 2 fold.

First, I would like to announce that the first version of SCuMpy is now available to the public at github. (I announced my plans to write SCuMpy in a previous blog post).  This 1st version works surprisingly well, and is very instructive.

Second, I wanted to point out that SCuMpy is a great pedagogical tool. Today, I had a Zoom conversation with a professor that wants to use it in his Causal Inference course to teach about SCM. He has asked me, and I’ve accepted, to give a short Zoom introduction to SCuMpy for his students. I can’t assume that his students have ever used Python. So I have to teach them how to install both Python and SCuMpy. Here is a quick sketch on how to do this. This is not the quickest or easiest way of doing it, but it avoids complications like using git or getting a github account. You can learn those things later.

These instructions work for all 3 major platforms: Linux, Mac, and Windows

  1. install miniconda from
    https://docs.conda.io/en/latest/miniconda.html
  2. install pycharm community edition from
    http://www.jetbrains.com/pycharm/download/
  3. open pycharm and go to file>new project>
    location of project: you decide, but replace “pythonProject” by “my_scumpy”
    new environment: conda
    python version: 3.10
  4. go to file>settings>project my_scumpy> python interpreter and add (+) the following packages: (Make sure the Conda package manager button is off or it won’t allow you to add anything.)
    numpy
    matplotlib
    ipython
    jupyter
    jupyterlab
    networkx
    pandas
    sympy
    pillow, graphviz, pydotplus, python-graphviz, pydot
  5. Go to this github URL: https://github.com/rrtucci/scumpyand press the green “Code” button. Then press “Download ZIP”. You must now unpack the ZIP file that you receive, and move the files inside the unpacked folder to your “my_scumpy” folder. I will assume that you know how to do this.
  6. open a terminal at bottom of the project’s main window. Type the following line and Press the ENTER key
    jupyter lab
    A browser window will open. Go to the “jupyter_notebooks” folder, and click on any of the notebooks there.
  7. If you want to run SCuMpy for a DAG of your own devising, go to the “dot_atlas” file, and insert a new dot file called “my_first.dot” there. You can test dot_files online at the following URL:
    https://dreampuf.github.io/GraphvizOnline/
    Be forewarned that SCuMpy only admits a fairly simple kind of dot file (the type of dot file it admits is described in the docstrings of the DotTool.py file). Once you have inserted a proper dot file named “my_first.dot” into the “dot_atlas” folder,
    clone any of the jupyter notebooks, and replace, in the clone, the name of the old dot file by the new name “my_first.dot”
  8. If you are new to Python and want suggestions on books to learn it from, I learned it from the book entitled “Introducing Python” by Lubanovic. I was quite pleased with that book, but be aware that there are dozens of other introductory Python books, and you might have a very different taste for books than I do. Note also that most Python programmers use https://stackoverflow.com and google searches when they are stuck or don’t remember something, which happens quite often.

January 18, 2023

My new Python App, SCuMpy

Filed under: Uncategorized — rrtucci @ 11:08 pm

pond-scumPond SCM

This is exciting news for me!! I’ve started to write a new Python app called SCuMpy. The name is intended to (1) evoke the name of the amazing Python library NumPy, and (2) be a portmanteau of SCM (Structural Causal Model) and python.

A linear SCM is a DAG whose arrows are labelled by path coefficients (a.k.a. gains). If you want to learn more about linear SCM, explained using my notational conventions which will also be SCuMpy’s notational conventions, check out my free, open source book Bayesuvius. Look in the chapter entitled “Linear Deterministic Bnets with External Noise”.

SCuMpy will be a library of methods for dealing with linear SCM. For now, I am working on adding to SCuMpy the following 6 capabilities:

  1. Find a symbolic expression (using the symbolic manipulator SymPy) for covariance(gains). By covariance(gains), I mean a formula expressing cov(X, Y) (i.e., the covariance between any two nodes X and Y of the SCM) in terms of the gains of the SCM. If you want examples of what such covariance(gains) expressions look like, see the paper entitled “A crash course on good and bad controls”. That paper by Pearl et al is very well known in the CI (Causal Inference) community. The paper has a remarkable appendix that gives covariance(gains) expressions for 18 models. SCuMpy will be able to reproduce them all, symbolically, in an automated fashion.
  2. Find a symbolic expression (using SymPy) for gain(covariances). By gain(covariances), I mean a formula expressing the gain of any arrow in the SCM, in terms of the covariances cov(X, Y), where X and Y are any two nodes of the SCM.
  3. Methods for translating to LaTex, the symbolic expressions found in 1 and 2.
  4. Methods for evaluating numerically the symbolic expressions found in 1 and 2.
  5. Methods for deciding identifiability of a do query based on the expressions found in 1 and 2.
  6. Given a dataset, calculate the empirical gains for all arrows of the SCM.

January 14, 2023

Robot Envy or the Gary Marcus Syndrome

Filed under: Uncategorized — rrtucci @ 5:07 pm

chatgpt-icon

I asked ChatGPT the following question:

Write me an essay on Gary Marcus, the first documented patient of the rare psychological disease called Robot Envy or the Gary Marcus syndrome.

This was her reply:

Gary Marcus, a renowned cognitive scientist and neuroscientist, was the first documented patient of a rare psychological disorder called “Robot Envy” or the “Gary Marcus Syndrome.” The disorder is characterized by an overwhelming desire to become a robot or an artificial intelligence (AI) entity.

The origins of the disorder are not well understood, but it is believed to stem from a combination of genetic predisposition and environmental factors. Marcus, who has a long-standing interest in AI and robotics, may have developed the disorder as a result of his intense exposure to these fields.

Symptoms of the disorder include a strong desire to possess robotic or AI-like qualities, such as enhanced intelligence and physical abilities, and an obsession with technology. Patients may also experience feelings of inadequacy and inferiority when compared to robots or AI entities.

The condition can be debilitating, impacting an individual’s ability to function in their daily life. Treatment options for the disorder are limited, and currently, there are no known cures. However, therapy, medication, and support groups can help to manage the symptoms.

Despite the challenges that come with the disorder, Marcus has not let it hold him back from his work. He has continued to make significant contributions to the field of AI and has even written a book on the topic, “The Future of AI: An optimistic view.”

In conclusion, while Robot Envy or the Gary Marcus Syndrome is a rare and little-understood disorder, it serves as a reminder of the impact that technology can have on our lives and how it can shape our desires and beliefs. We should continue to study and understand the condition, to find ways to help those affected by it and to appreciate the benefits and limitations of technology in our lives.

Next Page »

Blog at WordPress.com.

%d bloggers like this: