<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:base="https://grgv.xyz/">
  <title>Sergii Kharagorggiiev: projects &amp; blog</title>
  <subtitle></subtitle>
  <link href="https://grgv.xyz/feed.xml" rel="self"/>
  <link href="https://grgv.xyz/"/>
  <updated>2025-06-04T00:00:00Z</updated>
  <id>https://grgv.xyz/</id>
  <author>
    <name>Sergii Kharagorggiiev</name>
    <email>sk2009@gmail.com</email>
  </author>
  <entry>
    <title>Exploring vocabulary alignment of neurons in Llama-3.2-1B</title>
    <link href="https://grgv.xyz/blog/neurons1/"/>
    <updated>2025-06-04T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/neurons1/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;While applying &lt;a href=&quot;https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=disz2gTx-jooAcR0a5r8e7LZ&quot;&gt;logit attribution analysis&lt;/a&gt; to transformer outputs, I have noticed that in many cases the generated token can be attributed to the output of a single neuron.&lt;/p&gt;
&lt;p&gt;One way to analyze neuron activations is to collect them over a dataset of text snippets, as in “Exploring Llama-3-8B MLP Neurons” [1]. This shows that some neurons are strongly activated by a specific token from the model’s vocabulary; see, for example, the &amp;quot;Android&amp;quot; neuron: &lt;a href=&quot;https://neuralblog.github.io/llama3-neurons/neuron_viewer.html#0,2&quot;&gt;https://neuralblog.github.io/llama3-neurons/neuron_viewer.html#0,2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Another way to analyze neurons is to apply logit lens to the MLP weights, similar to “Analyzing Transformers in Embedding Space” [2], where model parameters are projected into the embedding space for interpretation.&lt;/p&gt;
&lt;h3&gt;Projecting neurons into vocabulary space&lt;/h3&gt;
&lt;p&gt;Let’s apply logit lens to a sample of MLP output weights for layer 13 of Llama-3.2-1B:&lt;/p&gt;
&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;LLAMA_3_PATH &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;meta-llama/Llama-3.2-1B-Instruct&quot;&lt;/span&gt;
model &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; HookedTransformer&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;from_pretrained&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;LLAMA_3_PATH&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; device&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;cuda&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; fold_ln&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; center_writing_weights&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; center_unembed&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;get_distance_to_tokens&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;weights&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; n&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; max_dot&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; W_U&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; top_n&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; print_lens&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; i &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;n&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;token comment&quot;&gt;# iterate over the first n neurons&lt;/span&gt;
        layer_vec &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; weights&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;i&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;  &lt;span class=&quot;token comment&quot;&gt;# [d_model]&lt;/span&gt;
    
        &lt;span class=&quot;token comment&quot;&gt;# Compute dot product with unembedding weights&lt;/span&gt;
        unembedded &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; torch&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;matmul&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;layer_vec&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; W_U&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;token comment&quot;&gt;# [d_vocab]&lt;/span&gt;
    
        &lt;span class=&quot;token comment&quot;&gt;# Take absolute value to get strongest alignments, pos or neg&lt;/span&gt;
        abs_unembedded &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; unembedded&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    
        &lt;span class=&quot;token comment&quot;&gt;# Get top-n tokens by absolute dot product&lt;/span&gt;
        s_abs&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; idx &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; abs_unembedded&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;topk&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;top_n&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; largest&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    
        results &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; j &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;top_n&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            token &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; model&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;to_string&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;idx&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            score &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; s_abs&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;item&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;token comment&quot;&gt;# score of the j-th top token&lt;/span&gt;
            results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;append&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;{0:.3f} {1}&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;score&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; token&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; print_lens&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;i&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; results&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;    
        max_dot&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;append&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;s_abs&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;item&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

block &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;13&lt;/span&gt;
weights &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; model&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;blocks&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;block&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;mlp&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;W_out
get_distance_to_tokens&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;weights&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; model&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;W_U&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.080 hazi&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.073 unders&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.070  Lak&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.069  OK&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.068 igrants&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.107 orgia&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.097 iy&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.090 sian&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.090 161&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.088 ária&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.057 aph&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.055 appen&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.052  Essen&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.052 usi&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.052 чення&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.083 úp&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.082  Sheets&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.079 aida&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.078  Wire&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.077 omb&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.074 stein&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.073  seed&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.072  pea&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.071  fib&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.070 iverse&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.082 ieres&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.082 iva&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.079 agger&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.079 mons&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.078 ento&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.312  coming&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.268  Coming&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.246 Coming&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.228 coming&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.224  Up&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;7&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.076 Sent&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.075  Killing&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.073  Sent&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.072  sent&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.071 hek&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.161 es&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.136 ths&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.130 ums&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.130 ues&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.129 oks&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;9&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.206  St&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.171 St&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.170  st&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.166 -st&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.157 -St&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.101 utherland&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.098 様&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.087 arken&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.087 utherford&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.087  cha&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;11&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.078 ica&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.076  statist&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.075  arrivals&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.073 ullet&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.072 ural&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;12&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.081 nut&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.080 �&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.078  Doc&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.076 zet&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.075  Sparks&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;13&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.087  disconnected&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.084  connection&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.083 connect&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.082 负&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.081  disconnect&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;14&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.225  det&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.214 det&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.205  Det&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.194 Det&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.175  DET&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;15&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.192  for&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.160 for&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.140 For&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.134  For&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.129  FOR&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;16&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.107  wa&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.087 /sub&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.084  sub&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.079 wa&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.077 sub&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;17&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.075  inf&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.074  subscript&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.071 ोह&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.070  sâu&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.069  Lad&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;18&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.082 �&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.082 endif&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.077  subtract&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.076 ola&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.076 OLA&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token number&quot;&gt;19&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;0.090 leh&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.086 تص&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.085  recher&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.084  Labels&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;0.080 abs&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It’s easy to spot a pattern: some neurons are closely aligned with a cluster of semantically similar tokens, for example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;6 [&#39;0.312  coming&#39;, &#39;0.268  Coming&#39;, &#39;0.246 Coming&#39;, &#39;0.228 coming&#39;, &#39;0.224  Up&#39;]
9 [&#39;0.206  St&#39;, &#39;0.171 St&#39;, &#39;0.170  st&#39;, &#39;0.166 -st&#39;, &#39;0.157 -St&#39;]
14 [&#39;0.225  det&#39;, &#39;0.214 det&#39;, &#39;0.205  Det&#39;, &#39;0.194 Det&#39;, &#39;0.175  DET&#39;]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Other neurons look much more random in their proximity to vocabulary embeddings, roughly equally dissimilar to various unrelated tokens:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;0 [&#39;0.080 hazi&#39;, &#39;0.073 unders&#39;, &#39;0.070  Lak&#39;, &#39;0.069  OK&#39;, &#39;0.068 igrants&#39;]
3 [&#39;0.083 úp&#39;, &#39;0.082  Sheets&#39;, &#39;0.079 aida&#39;, &#39;0.078  Wire&#39;, &#39;0.077 omb&#39;]
19 [&#39;0.090 leh&#39;, &#39;0.086 تص&#39;, &#39;0.085  recher&#39;, &#39;0.084  Labels&#39;, &#39;0.080 abs&#39;]
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Quantifying vocabulary alignment&lt;/h3&gt;
&lt;p&gt;Minimal distance (maximal absolute dot product) to the embedding of a vocabulary token looks like a good measure of how vocabulary-aligned a neuron is.&lt;/p&gt;
&lt;p&gt;In the previous example, this is the first number in each row:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;0 [&#39;0.080 hazi&#39;, &#39;0.073 unders&#39;, &#39;0.070  Lak&#39;, &#39;0.069  OK&#39;, &#39;0.068 igrants&#39;]
1 [&#39;0.107 orgia&#39;, &#39;0.097 iy&#39;, &#39;0.090 sian&#39;, &#39;0.090 161&#39;, &#39;0.088 ária&#39;]
2 [&#39;0.057 aph&#39;, &#39;0.055 appen&#39;, &#39;0.052  Essen&#39;, &#39;0.052 usi&#39;, &#39;0.052 чення&#39;]
3 [&#39;0.083 úp&#39;, &#39;0.082  Sheets&#39;, &#39;0.079 aida&#39;, &#39;0.078  Wire&#39;, &#39;0.077 omb&#39;]
&lt;/code&gt;&lt;/pre&gt;
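&lt;p&gt;The per-neuron loop above can also be collapsed into a single matrix product. A minimal sketch in plain PyTorch, with small random tensors standing in for the real weights (in the notebook, W_out is model.blocks[13].mlp.W_out and W_U is model.W_U):&lt;/p&gt;

```python
import torch

def max_vocab_alignment(W_out, W_U):
    # W_out: [d_mlp, d_model] MLP output weights; W_U: [d_model, d_vocab] unembedding.
    # One matmul gives every neuron's dot product with every vocabulary direction;
    # the max absolute value per row is the per-neuron alignment measure.
    logits = W_out @ W_U                     # [d_mlp, d_vocab]
    return logits.abs().max(dim=-1).values   # [d_mlp]

# Toy stand-in tensors, just to show the shapes
torch.manual_seed(0)
W_out = torch.randn(8, 16)
W_U = torch.randn(16, 32)
max_dot = max_vocab_alignment(W_out, W_U)
print(max_dot.shape)  # torch.Size([8])
```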
&lt;p&gt;Plotting these values for all neurons of layer 13:&lt;/p&gt;
&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;block &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;13&lt;/span&gt;
weights &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; model&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;blocks&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;block&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;mlp&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;W_out
max_dot &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
get_distance_to_tokens&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;weights&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; weights&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;shape&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; max_dot&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; model&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;W_U&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;img class=&quot;&quot; src=&quot;https://grgv.xyz/assets/neurons1/plot_1.png&quot; /&gt;
&lt;p&gt;This plot is not very informative. Let’s look at the distribution instead:&lt;/p&gt;
&lt;img class=&quot;&quot; src=&quot;https://grgv.xyz/assets/neurons1/plot_2.png&quot; /&gt;
&lt;p&gt;The distribution is non-symmetric: there is a long tail of neurons that are close to vocabulary tokens.&lt;/p&gt;
&lt;p&gt;Sorting the neurons by max dot product highlights the shape of the distribution even better: a significant number of neurons have outputs that are aligned with vocabulary embeddings.&lt;/p&gt;
&lt;img class=&quot;&quot; src=&quot;https://grgv.xyz/assets/neurons1/plot_3.png&quot; /&gt;
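&lt;p&gt;A sorted plot like this one takes only a few lines; a sketch assuming matplotlib, with placeholder values where the notebook uses the max_dot list filled by get_distance_to_tokens:&lt;/p&gt;

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safe for scripts
import matplotlib.pyplot as plt

# Placeholder values; in the notebook this list is filled by get_distance_to_tokens
max_dot = [0.08, 0.31, 0.11, 0.06, 0.22, 0.09, 0.07]

plt.plot(sorted(max_dot))
plt.xlabel("neuron index (sorted)")
plt.ylabel("max |dot product| with vocabulary")
plt.savefig("max_dot_sorted.png")
```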
&lt;h3&gt;Extending to other layers&lt;/h3&gt;
&lt;p&gt;This visualization can be repeated for the MLPs in all other layers. Across all the distributions, the majority of the neurons that are more strongly aligned with the vocabulary sit in the later blocks:&lt;/p&gt;
&lt;img class=&quot;&quot; src=&quot;https://grgv.xyz/assets/neurons1/plot_4.png&quot; /&gt;
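&lt;p&gt;Collecting the measure for every block can be sketched as a loop over layers; small random tensors stand in for the real weights here (in the notebook, each W comes from model.blocks[i].mlp.W_out):&lt;/p&gt;

```python
import torch

def per_layer_max_dot(W_outs, W_U):
    # For each layer's W_out [d_mlp, d_model], return the per-neuron
    # max absolute dot product with the vocabulary directions.
    return [(W @ W_U).abs().max(dim=-1).values for W in W_outs]

# Toy stand-in for a 4-layer model
torch.manual_seed(0)
W_U = torch.randn(16, 32)
W_outs = [torch.randn(8, 16) for _ in range(4)]
distributions = per_layer_max_dot(W_outs, W_U)
print(len(distributions), distributions[0].shape)  # 4 torch.Size([8])
```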
&lt;p&gt;It&#39;s easier to see the difference with separate plots:&lt;/p&gt;
&lt;img class=&quot;&quot; src=&quot;https://grgv.xyz/assets/neurons1/plot_5.png&quot; /&gt;
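&lt;p&gt;The per-layer sweep behind these plots can be sketched as follows (a hypothetical helper, assuming a list of per-layer MLP output weight matrices and the shared unembedding matrix):&lt;/p&gt;

```python
import numpy as np

def layer_alignment_profiles(mlp_out_weights, W_U):
    # mlp_out_weights: list of [n_neurons, d_model] arrays, one per layer.
    # Returns, for each layer, the neurons' max dot products with W_U,
    # sorted in descending order for plotting.
    profiles = []
    for W_out in mlp_out_weights:
        max_dot = (W_out @ W_U).max(axis=1)
        profiles.append(np.sort(max_dot)[::-1])
    return profiles
```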
&lt;p&gt;In summary, strong vocabulary alignment is clearly visible in a subset of neurons – especially in later layers. This opens up several follow-up questions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Do neurons that are close to a vocabulary embedding represent only one specific token, or are they representing a more abstract concept that just happens to be near a token&#39;s embedding? Does small distance to the vocabulary correlate with monosemanticity?&lt;/li&gt;
&lt;li&gt;What is the functional role of stronger vocabulary alignment? Are these neurons a mechanism for translating concepts from the model&#39;s internal representation back into token space, or are there some other roles?&lt;/li&gt;
&lt;li&gt;What is the coverage of this representation? Do all important tokens have a corresponding &amp;quot;vocabulary neuron&amp;quot;, or is this specialization reserved for a subset only? Why?&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Code&lt;/h4&gt;
&lt;p&gt;The notebook with the code is on github: &lt;a href=&quot;https://github.com/coolvision/interp/blob/main/LLaMA_jun_4_2025_neurons.ipynb&quot;&gt;https://github.com/coolvision/interp/blob/main/LLaMA_jun_4_2025_neurons.ipynb&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;References&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;Nguyễn, Thông. 2024. “Llama-3-8B MLP Neurons.” &lt;a href=&quot;https://neuralblog.github.io/llama3-neurons&quot;&gt;https://neuralblog.github.io/llama3-neurons&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Dar, G., Geva, M., Gupta, A., and Berant, J. 2022. “Analyzing Transformers in Embedding Space.” arXiv preprint arXiv:2209.02535.&lt;/li&gt;
&lt;li&gt;nostalgebraist. 2020. interpreting GPT: the logit lens. &lt;a href=&quot;https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens&quot;&gt;https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
</content>
  </entry>
  <entry>
    <title>Reflections on 2023</title>
    <link href="https://grgv.xyz/blog/2023/"/>
    <updated>2024-07-01T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/2023/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;It has not been a very good year. I personally can’t complain about anything. But the war in Ukraine continues, and the war in Israel has started.&lt;/p&gt;
&lt;p&gt;Yes, a single person can&#39;t do much about it, but at the very least we can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;vote for sane and competent people&lt;/li&gt;
&lt;li&gt;be well informed, not fall for propaganda, fight disinformation&lt;/li&gt;
&lt;li&gt;donate to the right causes&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let’s start the year with good and charitable deeds.&lt;br /&gt;
&lt;a href=&quot;https://standforukraine.com/&quot;&gt;https://standforukraine.com/&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://koloua.com/en/&quot;&gt;https://koloua.com/en/&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://give.oxfamamerica.org/page/29636/donate/1&quot;&gt;https://give.oxfamamerica.org/page/29636/donate/1&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://www.savethechildren.org/us/where-we-work/west-bank-gaza&quot;&gt;https://www.savethechildren.org/us/where-we-work/west-bank-gaza&lt;/a&gt;&lt;/p&gt;
&lt;hr class=&quot;black&quot; /&gt;
&lt;h3&gt;blog&lt;/h3&gt;
&lt;p&gt;This year I’ve worked on a bunch of side projects, and posted more on this blog. My posts were featured in &lt;a href=&quot;https://hackaday.com/&quot;&gt;hackaday&lt;/a&gt; and &lt;a href=&quot;https://www.weeklyrobotics.com/&quot;&gt;weekly robotics&lt;/a&gt;, which I’m quite happy about. The page views have increased from 20/month to 500/month. Not much, but small steps.&lt;/p&gt;
&lt;h3&gt;work&lt;/h3&gt;
&lt;p&gt;I work at &lt;a href=&quot;https://www.starship.xyz/&quot;&gt;Starship Technologies&lt;/a&gt;, where we use autonomous robots for local deliveries. Our robots are doing great! This year they have completed a million autonomous deliveries (and 5M in total since launch), and the service has been expanded to dozens of new sites in several countries. Looking forward to continued expansion in 2024.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/Starship-16x9-1-1536x864.png&quot; alt=&quot;Starship&quot; /&gt;&lt;/p&gt;
&lt;h3&gt;toy of the year&lt;/h3&gt;
&lt;p&gt;Hands down, it’s the 1m high construction crane:&lt;br /&gt;
&lt;img src=&quot;https://grgv.xyz/assets/Screenshot-2024-01-07.png&quot; alt=&quot;crane&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;films&lt;/h2&gt;
&lt;p&gt;Nothing too memorable this year, except for animation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Blue Giant&lt;/strong&gt;&lt;br /&gt;
This is a great film about jazz, passion and talent.&lt;br /&gt;
Did I like the movie just because of the music? Not entirely, the characters and the plot are lovely as well. Still, I liked the soundtrack by &lt;a href=&quot;https://www.hiromiuehara.com/&quot;&gt;Hiromi&lt;/a&gt; almost as much as the film itself. It’s not surprising – I do listen to her music a lot, and have been to her live shows twice.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/Blue-Giant-Soundtrack.jpg&quot; alt=&quot;Giant&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Boy and the Heron&lt;/strong&gt;&lt;br /&gt;
The new Hayao Miyazaki masterpiece.&lt;br /&gt;
The first half of the film is just great. As for the second half – somehow I felt there is just too much of everything in it: too many settings, creatures, events, mysteries… At first I thought the plot and the editing were too messy. But now I think that maybe it was intentional, and I just did not understand it fully on the first viewing?&lt;br /&gt;
In any case I’ll need to watch it again – it deserves several viewings just as any other movie by Miyazaki.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/heron4.jpg&quot; alt=&quot;Giant&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;books&lt;/h2&gt;
&lt;h4&gt;“Outlive: The Science and Art of Longevity” by Peter Attia&lt;/h4&gt;
&lt;p&gt;A book about health and longevity, written from a very solid science-backed standpoint.&lt;/p&gt;
&lt;p&gt;First of all, it made me reevaluate my attitude towards exercise. Before, I thought that doing a bare minimum is all right, as long as I&#39;m fine with my body shape. But the author makes an evidence-based point that being extremely fit and strong has a huge impact on long-term health and lifespan. Genetics is important, but apparently it&#39;s possible to &amp;quot;cheat&amp;quot; and get a similar effect from being in top shape.&lt;/p&gt;
&lt;p&gt;I also liked that the author does not give any prescriptions when there is no data. For example, he does not give much advice on diet/nutrition, apart from caring about macro-nutrient balance, simply because most studies on nutrition are so conflicting and uncertain.&lt;/p&gt;
&lt;p&gt;And the chapter about mental health is just great. It&#39;s so obvious – why even bother about health and longevity, before making sure one would be able to really enjoy life? Any book on healthy lifestyle should have such a chapter. 5/5.&lt;/p&gt;
&lt;h4&gt;&amp;quot;The Undoing Project: A Friendship That Changed Our Minds&amp;quot; by Michael Lewis&lt;/h4&gt;
&lt;p&gt;It&#39;s a nonfiction story about a pair of influential psychologists who had shown how surprisingly irrational people are. It&#39;s also intertwined with the story of their relationship, a bunch of other interesting characters, historical events, scientific paradoxes... All very well written. Highly recommended. 4/5.&lt;/p&gt;
&lt;h4&gt;“Mr. Penumbra&#39;s 24-Hour Bookstore” by Robin Sloan&lt;/h4&gt;
&lt;p&gt;Nice and entertaining read, kind of like “Ready Player One” but for bibliophiles and typography nerds? It’s filled with random trivia, and has the same dynamic of an unremarkable protagonist dragging a group of friends into a mystery hunt.&lt;/p&gt;
&lt;p&gt;I guess the book’s main problem is that it can’t decide how serious or goofy it wants to be. 90% of the novel is great, but 10% is total cringe. Descriptions of the tech scene are especially bad, not sure if it’s a creative choice or author’s ignorance. But, a really good mystery plot kind of compensates for all the shortcomings. 3/5.&lt;/p&gt;
&lt;h4&gt;”The Kaiju Preservation Society”, “Starter Villain” by John Scalzi&lt;/h4&gt;
&lt;p&gt;It’s nice, light and mostly entertaining reading. The main problem with these books is that they are just like your average mediocre Hollywood blockbuster – full of tropes and stereotypical characters, just without the good parts (block-busting VFX). There is too much witty banter. The same jokes are repeated too many times. I guess I’m just not a part of the target audience?&lt;/p&gt;
&lt;p&gt;I ended up reading the first half with some excitement, then getting increasingly bored, and finishing the books only because of curiosity about the plot. Which is an experience that I’ve also had with most of the recent Hollywood movies. 2/5.&lt;/p&gt;
&lt;h4&gt;“System Collapse, The Murderbot Diaries Book 7” by Martha Wells&lt;/h4&gt;
&lt;p&gt;The problem of this book is that it spends too much time on the character drama. I expected more of an action-packed thriller, like the previous books. Oh well, the first book (“All Systems Red”) is still great. 2/5.&lt;/p&gt;
&lt;h4&gt;”Witch King” by Martha Wells&lt;/h4&gt;
&lt;p&gt;This book has two storylines. Every other chapter jumps to the other storyline, which I hated. It’s like there is a cliffhanger after each chapter. 3/5 for one of the storylines, 2/5 overall.&lt;/p&gt;
&lt;h4&gt;SF paperbacks&lt;/h4&gt;
&lt;p&gt;From time to time I buy old SF books and magazines from a &lt;a href=&quot;https://matkallatallinnassa.com/en/restaurants/more-than-just-a-bookstore-read-opened-at-pohjala-factories/&quot;&gt;local bookstore&lt;/a&gt;. Old SF stories make for great entertaining reading, with many of them being surprisingly relevant and thought-provoking.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/IMG_5407.jpg&quot; alt=&quot;SF&quot; /&gt;&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Task vectors &amp; analogy making in LLMs</title>
    <link href="https://grgv.xyz/blog/copycat2/"/>
    <updated>2024-01-05T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/copycat2/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;I have described the problem of analogy-making interpretability &lt;a href=&quot;https://grgv.xyz/blog/copycat/&quot;&gt;in the previous post&lt;/a&gt;: given the examples of transformed sequences of numbers, what’s the mechanism behind figuring this transformation out, and applying it correctly to the incomplete (test) sequence?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;prompt:&lt;/strong&gt; &amp;quot;0 1 2 to 2 1 0, 1 2 3 to 3 2 1, 4 5 6 to &amp;quot;, &lt;strong&gt;output:&lt;/strong&gt; “6 5 4”&lt;/p&gt;
&lt;p&gt;It was easy to check on which layer the correct answer appears, but tracing the sources of that answer to earlier layers turned out to be challenging.&lt;/p&gt;
&lt;h3&gt;meaningful intermediate embeddings?&lt;/h3&gt;
&lt;p&gt;When I applied the &lt;a href=&quot;https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens&quot;&gt;logit lens&lt;/a&gt; [1] to the output of attention blocks, for the prompt that contained reversed sequences of numbers, I noticed that the output contained the &lt;em&gt;“reverse”&lt;/em&gt; token (at the last token position of layer 15).&lt;/p&gt;
&lt;p&gt;I’m using a llama.cpp-based app (described in &lt;a href=&quot;https://grgv.xyz/blog/llama.mia/&quot;&gt;the previous post&lt;/a&gt;) for showing the logit lens output. Each row corresponds to a token position, and lists top 5 tokens sorted by the logit score:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;./mia -m llama2.gguf --prompt &quot;0 1 2 to 2 1 0, 1 2 3 to 3 2 1, 4 5 6 to&quot; -n 5 --logit-lens kqv_out 5

Layer #15 kqv_out-15:
0: дар 0.35|oure 0.35|kar 0.33| Según 0.33|aki 0.3|
1:  dust 0.36|textt 0.36|elde 0.35|azzo 0.34| retro 0.34|
2: 典 0.37| Censo 0.35|oure 0.35| Aires 0.35| pó 0.34|
3: ḷ 0.39|ket 0.39| estaven 0.39|öß 0.39|oure 0.38|
4: zerw 0.62| estaven 0.51|cita 0.5| alberga 0.49|łow 0.48|
[...]
16: shal 0.84|ket 0.73|Assert 0.72|ając 0.66|sono 0.66|
17: ipt 0.95|кта 0.88|inal 0.86| inform 0.85| advanced 0.85|
18: кта 0.85|minipage 0.83| Mean 0.77|Assert 0.75| meaning 0.74|
19: ipt 0.78|Duration 0.76|zug 0.75|gemeinde 0.75|mannschaft 0.72|
20: shal 0.64|agy 0.64|prev 0.62| SA 0.6| Gay 0.58|
21:  &lt;b&gt;revers 0.75| reverse 0.68&lt;/b&gt;|mat 0.67|shal 0.66|vat 0.66|
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Although the task &lt;em&gt;is&lt;/em&gt; about reversing, &lt;strong&gt;“reverse”&lt;/strong&gt; is not mentioned explicitly anywhere in either input or output.&lt;/p&gt;
&lt;p&gt;I tried subtracting the embedding of the “reverse” token, effectively removing it, to check whether it is part of the analogy-making mechanism. It did not affect the output, which suggests that this token is not critical for generation, and might just be a side effect of some other mechanism.&lt;/p&gt;
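&lt;p&gt;A standard way to implement such a removal is directional ablation: projecting the token’s direction out of the residual-stream vector (a sketch of the general technique, with hypothetical names; plain subtraction of the embedding is a cruder variant):&lt;/p&gt;

```python
import numpy as np

def remove_token_direction(hidden, token_embedding):
    # Project out the component of a residual-stream vector that lies
    # along a token's embedding direction (directional ablation).
    d = token_embedding / np.linalg.norm(token_embedding)
    return hidden - np.dot(hidden, d) * d
```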
&lt;h3&gt;task vectors&lt;/h3&gt;
&lt;p&gt;While looking for more information, I found that this topic is covered by two recent papers on so-called “task vectors” [2, 3]. These papers concurrently explore similar ideas, approaching the problem differently and complementing each other.&lt;/p&gt;
&lt;p&gt;So, what’s a task vector? Suppose that the prompt is a set of examples of some transformation. For instance, pairs of English and French words, implying a translation task:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;“bonjour -&amp;gt; hello, one -&amp;gt; un”&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;We expect that, by analogy, the model will complete the next English words with French translations as well. For a prompt that has an incomplete test query in addition to the training examples:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;“bonjour -&amp;gt; hello, one -&amp;gt; un, yes -&amp;gt; “&lt;/strong&gt;&lt;/em&gt;, output should be &lt;em&gt;&lt;strong&gt;“oui”&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The idea behind “task vectors” is that in this case the model is working in two stages.&lt;/p&gt;
&lt;p&gt;First, in the earlier layers, the model creates an abstracted, compressed representation of the “translate into French” task, based on the training examples. This task description (the task vector) is stored in the embedding space. Then, later layers use this task vector as guidance for which transformation to apply to subsequent completions.&lt;/p&gt;
&lt;p&gt;It follows that these two stages can be split and applied separately: it’s possible to extract the task vector and use it instead of the training examples to get correct test predictions.&lt;/p&gt;
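&lt;p&gt;In pseudocode, the extract-and-patch idea boils down to the following (a toy sketch with hypothetical names, not the actual llama.mia internals):&lt;/p&gt;

```python
import numpy as np

def save_task_vector(residuals, layer, token_idx):
    # residuals: per-layer list of [n_tokens, d_model] residual-stream states,
    # captured while running the demonstration prompt.
    return residuals[layer][token_idx].copy()

def patch_task_vector(residuals, layer, token_idx, task_vector):
    # Overwrite the test prompt's residual vector at the same layer, so
    # later layers see the saved "task description" instead of raw context.
    residuals[layer][token_idx] = task_vector
    return residuals
```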
&lt;p&gt;In [2], extraction and application of a task vector is as simple as copying and pasting an embedding vector for a specific token. In [3] the methodology is more complicated, but the effect is similar in both cases.&lt;/p&gt;
&lt;p&gt;Another interesting point is the observation in [2] that task vectors contain tokens that describe the tasks:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In multiple cases, we observe tokens that directly describe the task. Importantly, these terms never explicitly appeared in the context. For example in the task of translation from French to English, we observe tokens such as “English” and “translate”. This supports our view that θ carries significant, non-trivial semantic information about the task [2]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This explains finding the “reverse” token embedding in my experiments with the logit lens.&lt;/p&gt;
&lt;h2&gt;examples of task vectors&lt;/h2&gt;
&lt;p&gt;For an example of using a task vector, let’s look at the prompt: &lt;strong&gt;“France -&amp;gt;”&lt;/strong&gt;. Without any interventions, Llama2 output is “Italy” (again, using &lt;a href=&quot;https://grgv.xyz/blog/llama.mia/&quot;&gt;llama.cpp-based application&lt;/a&gt;):&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;--model&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;France -&gt;&quot;&lt;/span&gt;
output: &lt;span class=&quot;token string&quot;&gt;&quot; Italy&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s use a task vector to modify the model’s state so that instead, it would output a capital city for a given country: &lt;strong&gt;“France -&amp;gt; Paris”&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;First, to create the task vector, we need several examples of the target transformation (country -&amp;gt; capital):&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;Egypt -&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt; Cairo, Norway -&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt; Oslo, Estonia -&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With these examples as the input, we can save the vector that corresponds to the last token, from the model’s residual stream on layer #14&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://grgv.xyz/blog/copycat2/#fn1&quot; id=&quot;fnref1&quot;&gt;[1]&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;--model&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;Egypt -&gt; Cairo, Norway -&gt; Oslo, Estonia -&gt;&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--save&lt;/span&gt; l_out-14 ~/tmp/l_out-14
output: &lt;span class=&quot;token string&quot;&gt;&quot; Tallinn&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And finally, to apply the task vector to the initial prompt, we need to patch the last token’s vector with the saved vector, on the same layer #14:&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;--model&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;France -&gt;&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--patch&lt;/span&gt; l_out-14 ~/tmp/l_out-14 --from-token-idx &lt;span class=&quot;token number&quot;&gt;13&lt;/span&gt; --to-token-idx &lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt;
output: &lt;span class=&quot;token string&quot;&gt;&quot; Paris&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It worked: the output was successfully modified, and the task vector patching induced the generation of the country’s capital.&lt;/p&gt;
&lt;h2&gt;task vectors for copycat-like problems&lt;/h2&gt;
&lt;p&gt;Does it work for the copycat-like problems from the previous post? For example:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;prompt:&lt;/strong&gt; &amp;quot;0 1 2 to 2 1 0, 1 2 3 to 3 2 1, 4 5 6 to &amp;quot;, &lt;strong&gt;output:&lt;/strong&gt; “6 5 4”&lt;/p&gt;
&lt;p&gt;In this case, based on several train examples, the model reverses the last incomplete example, by analogy. Let’s check if reversal can be induced using a task vector instead of the examples.&lt;/p&gt;
&lt;p&gt;Without any examples, the output is a straightforward continuation of the sequence:&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;--model&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;4 5 6 to 6 &quot;&lt;/span&gt;
output: &lt;span class=&quot;token string&quot;&gt;&quot;7 8&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Creating a task vector based on several examples:&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;--model&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;1 2 3 to 3 2 1, 5 6 7 to 7 &quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--save&lt;/span&gt; l_out-14 ~/tmp/l_out-14
output: &lt;span class=&quot;token string&quot;&gt;&quot;6 5&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And, apply the task vector:&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;--model&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;4 5 6 to 6 &quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--patch&lt;/span&gt; l_out-14 ~/tmp/l_out-14 --from-token-idx &lt;span class=&quot;token number&quot;&gt;24&lt;/span&gt; --to-token-idx &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt;
output: &lt;span class=&quot;token string&quot;&gt;&quot;5 4&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Works correctly for this example as well.&lt;/p&gt;
&lt;h3&gt;interpretability of the task vectors&lt;/h3&gt;
&lt;p&gt;The next question is – what are the mechanisms behind the task vectors? Both parts: creation based on train examples, and application to novel test examples.&lt;/p&gt;
&lt;p&gt;In [2], the authors stop at analyzing the effects of task vector application, while [3] goes further, finding the set of attention heads that strongly affect the task vector. However, there is still no understanding of the sub-circuits and specific computational structures.&lt;/p&gt;
&lt;p&gt;There are many other open questions, for example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;are task vectors created for any and all transformations encountered in the inputs, regardless of the number of examples?&lt;/li&gt;
&lt;li&gt;there are many categories of transformations: sequence continuation, semantic similarity, etc… Are there shared mechanisms between task vectors of different types?&lt;/li&gt;
&lt;li&gt;how do the task vectors emerge, and are they similar across LLMs?&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;references&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&quot;https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens&quot;&gt;interpreting GPT: the logit lens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;R. Hendel, M. Geva, A. Globerson, In-Context Learning Creates Task Vectors. &lt;em&gt;arXiv&lt;/em&gt;, doi: 10.48550/arxiv.2310.15916 (2023). &lt;a href=&quot;https://arxiv.org/abs/2310.15916&quot;&gt;https://arxiv.org/abs/2310.15916&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;E. Todd, M. L. Li, A. S. Sharma, A. Mueller, B. C. Wallace, D. Bau, Function Vectors in Large Language Models (2023). &lt;a href=&quot;https://paperswithcode.com/paper/function-vectors-in-large-language-models&quot;&gt;https://paperswithcode.com/paper/function-vectors-in-large-language-models&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr class=&quot;footnotes-sep&quot; /&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol class=&quot;footnotes-list&quot;&gt;
&lt;li id=&quot;fn1&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;Layer #14 is selected based on experimental data from [2], and on testing several options for the selected examples. &lt;a href=&quot;https://grgv.xyz/blog/copycat2/#fnref1&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;
</content>
  </entry>
  <entry>
    <title>Llama.MIA — fork of Llama.cpp with interpretability features </title>
    <link href="https://grgv.xyz/blog/llama.mia/"/>
    <updated>2023-12-28T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/llama.mia/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;I have been using &lt;a href=&quot;https://github.com/ggerganov/llama.cpp&quot;&gt;llama.cpp&lt;/a&gt; for learning about transformers and experimenting with &lt;a href=&quot;https://grgv.xyz/blog/llama/&quot;&gt;LLM visualizations&lt;/a&gt; and &lt;a href=&quot;https://www.neelnanda.io/mechanistic-interpretability/quickstart&quot;&gt;mechanistic interpretability&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Initially I had just inserted a bunch of code all over the ggml compute framework. That code was not thread safe, used lots of hardcoded values, and communicated through global variables.&lt;/p&gt;
&lt;p&gt;Now I have refactored it a bit, moving most of the code into hooks/callbacks. The &lt;a href=&quot;https://github.com/coolvision/llama.mia/tree/mia&quot;&gt;new version&lt;/a&gt; is called &lt;strong&gt;Llama.MIA&lt;/strong&gt;, where MIA stands for “mechanistic interpretability application”. For now, only the CPU version is supported, and it has been tested only with Llama2.&lt;/p&gt;
&lt;p&gt;The next sections describe building, setup and usage.&lt;/p&gt;
&lt;h2&gt;Setup&lt;/h2&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# get the code, checkout the branch&lt;/span&gt;
&lt;span class=&quot;token function&quot;&gt;git&lt;/span&gt; clone https://github.com/coolvision/llama.mia
&lt;span class=&quot;token builtin class-name&quot;&gt;cd&lt;/span&gt; llama.mia
&lt;span class=&quot;token function&quot;&gt;git&lt;/span&gt; checkout mia

&lt;span class=&quot;token comment&quot;&gt;# build&lt;/span&gt;
&lt;span class=&quot;token function&quot;&gt;make&lt;/span&gt; mia

&lt;span class=&quot;token comment&quot;&gt;# obtain the original LLaMA2 model weights and place them in ./models&lt;/span&gt;
&lt;span class=&quot;token function&quot;&gt;ls&lt;/span&gt; ./models
ggml-vocab-llama.gguf  llama-2-7b  llama-2-7b-chat  llama.sh  tokenizer_checklist.chk  tokenizer.model

&lt;span class=&quot;token comment&quot;&gt;# install Python dependencies&lt;/span&gt;
python3 &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; pip &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-r&lt;/span&gt; requirements.txt

&lt;span class=&quot;token comment&quot;&gt;# convert to ggml FP16 format&lt;/span&gt;
python3 convert.py models/llama-2-7b-chat/

&lt;span class=&quot;token comment&quot;&gt;# quantize the model to 4-bits (using q4_0 method)&lt;/span&gt;
./quantize ./models/llama-2-7b-chat/ggml-model-f16.gguf ./models/llama-2-7b-chat/ggml-model-q4_0.gguf q4_0

&lt;span class=&quot;token comment&quot;&gt;# run the inference&lt;/span&gt;
&lt;span class=&quot;token function&quot;&gt;ln&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-s&lt;/span&gt; ./models/llama-2-7b-chat/ggml-model-q4_0.gguf llama2.gguf
./mia &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;-n&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;128&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Attention map visualization&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;--draw PATH&lt;/code&gt;, for example:&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;William Shakespeare was born in the year&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-n&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--draw&lt;/span&gt; ~/tmp/llama_vis.png&lt;/code&gt;&lt;/pre&gt;
&lt;img class=&quot;w-100&quot; src=&quot;https://grgv.xyz/assets/llama_vis/llama_vis.png&quot; /&gt;
&lt;h2&gt;Computation graph printout&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;--print-cgraph&lt;/code&gt; — first prints the details of all the tensors:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;./mia -m llama2.gguf --print-cgraph

TYPE   OP              NDIMS NE0 NE1 NE2 NE3  NB0 NB1   NB2      NB3       DATA                                       NAME
q4_0   NONE            2 4096 4096 1 1        18  2304  9437184  9437184   0x7fb20e57ae40              blk.0.attn_k.weight
q4_0   NONE            2 4096 32000 1 1       18  2304  73728000 73728000  0x7fb202f9ce40                token_embd.weight
i32    NONE            1 1 1 1                1   4     4        4         0x7fb1ee52e020                       inp_tokens
f32    NONE            1 4096 1 1 1           4   16384 16384    16384     0x7fb21490ae40           blk.0.attn_norm.weight
[...]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, it prints all the nodes in the computation graph:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ARG    TYPE   OP                  NDIMS NE0 NE1 NE2 NE3     NB0 NB1    NB2    NB3                    DATA                   NAME
DST    f32    GET_ROWS            1 4096 1 1 1              4   16384  16384    16384      0x7fb1ee52e040               inp_embd
SRC    q4_0   NONE                2 4096 32000 1 1          18  2304   73728000 73728000   0x7fb202f9ce40      token_embd.weight
SRC    i32    NONE                1 1 1 1 1                 4   4      4        4          0x7fb1ee52e020             inp_tokens

DST    f32    RMS_NORM            1 4096 1 1 1              4   16384  16384    16384               (nil)                 norm-0
SRC    f32    GET_ROWS            1 4096 1 1 1              4   16384  16384    16384      0x7fb1ee52e040               inp_embd
[...]
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Logit lens&lt;/h2&gt;
&lt;p&gt;Internally, transformers use representations that are encoded in the embedding space. To convert a transformer&#39;s output into meaningful tokens, it is multiplied by the unembedding matrix.&lt;/p&gt;
&lt;p&gt;The idea of the &lt;a href=&quot;https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens&quot;&gt;logit lens&lt;/a&gt; is to apply the same transformation to intermediate layers, making it possible to interpret the transformer&#39;s hidden internal state.&lt;/p&gt;
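&lt;p&gt;In code, the logit lens amounts to a single matrix product with the unembedding matrix (a NumPy sketch with hypothetical names; llama.mia does this inside the ggml compute graph):&lt;/p&gt;

```python
import numpy as np

def logit_lens(hidden, W_U, top_k):
    # Decode an intermediate residual-stream vector the same way the final
    # output is decoded: multiply by the unembedding matrix, take top tokens.
    logits = hidden @ W_U
    top = np.argsort(logits)[::-1][:top_k]
    return [(int(i), float(logits[i])) for i in top]
```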
&lt;p&gt;&lt;code&gt;--ll TENSOR_NAME TOP_K, --logit-lens TENSOR_NAME TOP_K&lt;/code&gt; — prints the TOP_K un-embedded tokens for a specified tensor. Partial matches of tensor names work as well: if TENSOR_NAME is “l_out”, it will print logit lens results for all layers (“l_out-0”, “l_out-1”, …), while “l_out-20” will print it only for layer 20.&lt;br /&gt;
It works for tensors of the residual stream (l_out), as well as for attention outputs (kqv_out), MLP outputs (ffn_out), and other tensors with the embedding dimensions.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;./mia -m llama2.gguf --prompt &amp;quot;The capital of Japan is&amp;quot; -n 5 --ll l_out 8

unembed LN 0 l_out-0:
0:  Архив 1.3|bolds 1.3|archivi 1.1| konn 1.1|penas 1.1| partiellement 1.1|пута 1|embros 0.97|
1: пута 0.51| sier 0.42| Censo 0.38| Архив 0.37| virtuel 0.37|penas 0.36|Portail 0.36|férences 0.33|
2: férences 0.18| straight 0.18|empre 0.17| Censo 0.17| succ 0.17|寺 0.17| soft 0.17|csol 0.17|
3: пута 0.44| partiellement 0.33|archivi 0.29| sier 0.28| cí 0.28|Sito 0.28| konn 0.28|embros 0.25|
4: archivi 0.25|➖ 0.25|пута 0.25| returns 0.23|瀬 0.23|textt 0.22|Sito 0.21|ѐ 0.21|
5: 昌 0.17|ic 0.16| Tro 0.15| solution 0.15| first 0.14|icked 0.14| ic 0.14|opera 0.14|
6: пута 0.28|nt 0.26|archivi 0.24|embros 0.21| sier 0.21|Sito 0.21|阳 0.21| also 0.2|

[...]

unembed LN 18 l_out-18:
0:  Unterscheidung 1.1e+02| Hinweis 1.1e+02| nobody 1e+02| sierp 1e+02| everybody 1e+02| Einzeln 98| kwiet 97| Begriffe 95|
1:  6.9|penas 6.2| following 5.6|odor 5.5| článku 5| purpose 4.7|Ḩ 4.6| Following 4.6|
2: ization 7.5|ist 6.5| pun 6.4|isation 6| city 5.7|ized 5.7| letters 5.6|ists 5.2|
3:  France 4.5|flow 4.4| flows 3.9| Germany 3.8| Australia 3.7| United 3.6|imo 3.5| Italy 3.5|
4: ped 3.3|amb 3|ira 3|ade 2.9| conserv 2.8|ung 2.8|ew 2.8| pied 2.7|
5: ese 6.3|eses 4.7| Une 4|esen 3.9|imation 3.5|fen 3.5|amer 3.4| abbre 3.3|
6:  capital 6.3| Tokyo 5.3| Capital 4.9| capit 4.8| called 4.3| city 4.3| cities 4.2| Hinweis 3.9|
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Attention head zero-ablation&lt;/h2&gt;
&lt;p&gt;Zeroing the output of an attention head is useful for verifying whether it is responsible for a certain behavior. For example, see &lt;a href=&quot;https://www.lesswrong.com/posts/6tHNM2s6SWzFHv3Wo/mechanistically-interpreting-time-in-gpt-2-small&quot;&gt;&amp;quot;Mechanistically interpreting time in GPT-2 small&amp;quot;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;-a INDEXES, --ablate INDEXES&lt;/code&gt; — zero ablate attention heads with indexes from a comma-separated list&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;William Shakespeare was born in the year&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-n&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--ablate&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0,1&lt;/span&gt;,2,3,4,5,6,7,8,48,49,50,60,175,180,190,200 &lt;span class=&quot;token parameter variable&quot;&gt;--draw&lt;/span&gt; ~/tmp/llama_vis_a.png&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The effect of the ablation can be inspected on the visualized attention maps.&lt;/p&gt;
&lt;img class=&quot;w-50-ns&quot; src=&quot;https://grgv.xyz/assets/llama_vis/llama_vis_a.png&quot; /&gt;
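&lt;p&gt;In code, zero-ablation amounts to clearing the chosen heads&#39; slice of a per-head attention output; a toy numpy sketch (the tensor layout is an assumption, not necessarily mia&#39;s):&lt;/p&gt;

```python
# Zero-ablation sketch on a per-head attention output of shape
# (n_tokens, n_heads, d_head); the layout is assumed for illustration.
import numpy as np

n_tokens, n_heads, d_head = 5, 32, 64
attn_out = np.ones((n_tokens, n_heads, d_head))

ablate = [0, 1, 24]           # head indexes to zero, as in --ablate 0,1,24
attn_out[:, ablate, :] = 0.0  # remove those heads' contribution for every token

print(attn_out.sum())         # only the remaining 29 heads contribute
```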
&lt;p&gt;&lt;code&gt;-s LAYER INDEX, --select LAYER INDEX&lt;/code&gt; — for a specific layer, zero all attention heads except for one with specified index. For example, leave only one head on L16:&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;William Shakespeare was born in the year&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-n&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--select&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;16&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;24&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--draw&lt;/span&gt; ~/tmp/llama_vis_a.png&lt;/code&gt;&lt;/pre&gt;
&lt;img class=&quot;w-50-ns&quot; src=&quot;https://grgv.xyz/assets/llama_vis/llama_vis_s.png&quot; /&gt;
&lt;h2&gt;Saving tensors&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;--save NAME PATH&lt;/code&gt; — save any tensor to a file&lt;/p&gt;
&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;./mia &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; llama2.gguf &lt;span class=&quot;token parameter variable&quot;&gt;--prompt&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;5 6 7 to 7 6 5, 2 3 4 to 4 &quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-n&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--save&lt;/span&gt; l_out-14 ~/tmp/l_out-14-2&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Loading (patching) tensors&lt;/h2&gt;
&lt;p&gt;Activation patching is useful for analyzing connections between components of a transformer. For example, see &lt;a href=&quot;https://www.lesswrong.com/posts/xh85KbTFhbCz7taD4/how-to-think-about-activation-patching&quot;&gt;&amp;quot;How to Think About Activation Patching&amp;quot;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;--patch NAME PATH PATCH_FROM_IDX PATCH_TO_IDX&lt;/code&gt; — patch the model’s tensor NAME at a specific token index, with values from the file PATH. The number of tokens in the loaded tensor may differ from the current number of tokens.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;--patch-avg NAME PATH1 PATH2 PATCH_FROM_IDX PATCH_TO_IDX&lt;/code&gt; — same as the previous option, but the values are loaded from two files and averaged. The two input tensors must have the same dimensions.&lt;/p&gt;
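&lt;p&gt;A toy numpy sketch of what &lt;code&gt;--patch-avg&lt;/code&gt; does (file loading is replaced by in-memory arrays; shapes and values are illustrative):&lt;/p&gt;

```python
# --patch-avg sketch: average two saved activations and write the result into
# the live tensor at one token position (all values here are toy data).
import numpy as np

d_model = 4
live = np.zeros((6, d_model))         # current run: 6 tokens

saved_a = np.full((3, d_model), 2.0)  # activations "loaded" from two earlier
saved_b = np.full((3, d_model), 4.0)  # runs; token counts may differ

patch_from, patch_to = 2, 5           # PATCH_FROM_IDX and PATCH_TO_IDX
live[patch_to] = (saved_a[patch_from] + saved_b[patch_from]) / 2.0

print(live[patch_to])                 # [3. 3. 3. 3.]
```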
</content>
  </entry>
  <entry>
    <title>Mechanistic interpretability of LLM analogy-making</title>
    <link href="https://grgv.xyz/blog/copycat/"/>
    <updated>2023-10-20T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/copycat/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;Can LLMs make analogies? Yes: according to &lt;a href=&quot;https://medium.com/@melaniemitchell.me/can-gpt-3-make-analogies-16436605c446&quot;&gt;tests done by Melanie Mitchell&lt;/a&gt; a few years back, GPT-3 is quite decent at “Copycat” letter-string analogy-making problems. Copycat was invented by Douglas Hofstadter in the 80s as a very simple “microworld” that would capture some key aspects of human analogy reasoning. An example of a Copycat problem:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;”If the string &lt;strong&gt;abc&lt;/strong&gt; changes to the string &lt;strong&gt;abd&lt;/strong&gt;, what does the string &lt;strong&gt;pqr&lt;/strong&gt; change to?“&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Many more examples are collected on &lt;a href=&quot;https://melaniemitchell.me/ExplorationsContent/analogy-problems.html&quot;&gt;this page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A project that I&#39;m working on while studying &lt;a href=&quot;https://www.neelnanda.io/mechanistic-interpretability/quickstart&quot;&gt;mechanistic interpretability (MI)&lt;/a&gt;, is applying MI to an LLM&#39;s ability to solve Copycat problems.&lt;/p&gt;
&lt;p&gt;According to &lt;a href=&quot;https://en.wikipedia.org/wiki/Douglas_Hofstadter&quot;&gt;Douglas Hofstadter&lt;/a&gt;, analogy is the core of cognition [1], and it can be argued that it is a basis for various abstract reasoning abilities. There are other similar problem domains that require few-shot abstract reasoning — like &lt;a href=&quot;https://people.csail.mit.edu/asolar/SynthesisCourse/Lecture2.htm&quot;&gt;inductive program synthesis&lt;/a&gt;, &lt;a href=&quot;https://www.kaggle.com/c/abstraction-and-reasoning-challenge/discussion/131741&quot;&gt;abstraction and reasoning challenge&lt;/a&gt;, etc… Still, Copycat is the simplest one, which makes it a good starting point for MI.&lt;/p&gt;
&lt;h3&gt;selecting a model &amp;amp; test prompts&lt;/h3&gt;
&lt;p&gt;It would be nice to use GPT-2 for this investigation, since a lot of recent MI work is based on it. But, alas, it does not seem to be able to solve analogy puzzles.&lt;/p&gt;
&lt;p&gt;I have chosen &lt;strong&gt;Llama-7B-chat&lt;/strong&gt; — it’s able to solve Copycat-like problems, and is small enough to be convenient for experimentation. It does not work as well as GPT-3.5 for Copycat and I had to tweak the problem formulation, but, eventually, I got it to solve simplified problems like:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;prompt:&lt;/strong&gt; &amp;quot;0 1 2 to 2 1 0, 1 2 3 to 3 2 1, 4 5 6 to &amp;quot;, &lt;strong&gt;output:&lt;/strong&gt; “6 5 4”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;prompt:&lt;/strong&gt; &amp;quot;0 1 2 to 0 1 3, 1 2 3 to 1 2 4, 4 5 6 to &amp;quot;, &lt;strong&gt;output:&lt;/strong&gt; “4 5 7”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;prompt:&lt;/strong&gt; “0 1 to 0 1 1, 1 2 to 1 2 2, 4 5 to “, &lt;strong&gt;output:&lt;/strong&gt; “4 5 5”&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Llama-7B has a fairly &amp;quot;standard&amp;quot; transformer architecture with 32 blocks and 32 attention heads in each block.&lt;/p&gt;
&lt;h3&gt;logit lens&lt;/h3&gt;
&lt;p&gt;I started by applying the logit lens [3] to the first test prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;quot;0 1 2 to 2 1 0, 1 2 3 to 3 2 1, 4 5 6 to &amp;quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Logit lens output for this prompt is shown below: the top 5 tokens with probabilities, predicted for the last token in the prompt, for all 32 blocks of Llama-7B-chat. It&#39;s supposed to predict &amp;quot;6&amp;quot;, the first token of the correct solution:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;f6&quot;&gt;0: пута 0,076 | penas 0,015 |  sier 0,011 | сылки 0,009 |  partiellement 0,009 | 
1:  rep 0,006 |  accomp 0,006 |  soft 0,005 |  regular 0,005 |  use 0,004 | 
2:  rep 0,016 |  accomp 0,010 |  M 0,007 | gex 0,004 |  use 0,004 | 
3: pec 0,021 | 间 0,009 |  gepublic 0,009 |  wat 0,007 | opp 0,007 | 
4: pec 0,039 | Пе 0,015 | ynt 0,006 |  util 0,006 |  voc 0,005 | 
5: pec 0,017 | ynt 0,014 | oro 0,006 | igt 0,006 |  mn 0,005 | 
6: oth 0,015 | conde 0,008 | arz 0,008 | ynt 0,008 | со 0,008 | 
7: со 0,015 | patch 0,007 |  lex 0,005 | oth 0,005 |  Mand 0,005 | 
8:  gate 0,020 |  Bru 0,009 | lea 0,007 | lear 0,007 | mers 0,006 | 
9: со 0,020 | 宿 0,009 | anim 0,008 | nelle 0,007 | ❯ 0,007 | 
10: iente 0,012 | ❯ 0,012 |  Pas 0,011 | ole 0,007 | lear 0,006 | 
11: ole 0,032 | iente 0,018 | ще 0,011 | reen 0,007 | colo 0,007 | 
12: ole 0,012 |  Glen 0,011 | pas 0,006 | sono 0,006 | lex 0,006 | 
13: vert 0,017 | 忠 0,012 | vice 0,012 |  Vert 0,008 | bage 0,007 | 
14:  mul 0,023 |  Mul 0,014 | sono 0,010 |  tie 0,008 | vice 0,006 | 
15: sono 0,019 |  Mul 0,014 |  Pas 0,011 | vice 0,008 |  tie 0,006 | 
16: sono 0,014 | tring 0,014 | &lt;b class=&quot;red&quot;&gt;6 0,012&lt;/b&gt; |  kwiet 0,008 | aug 0,007 | 
17: &lt;b class=&quot;red&quot;&gt;6 0,744&lt;/b&gt; |  six 0,115 |  Six 0,059 |  sixth 0,017 | 六 0,009 | 
18: 6 0,715 |  six 0,164 |  Six 0,049 |  sixth 0,009 | 六 0,003 | 
19: 6 0,852 |  six 0,097 |  Six 0,010 |  sixth 0,007 |  seis 0,003 | 
20: 6 0,920 |  six 0,034 |  Six 0,007 |  sixth 0,007 | 5 0,006 | 
21: 6 0,884 |  six 0,042 | 5 0,009 |  sixth 0,007 |  Six 0,006 | 
22: 6 0,843 |  six 0,037 | 5 0,014 |  sixth 0,008 |  Six 0,008 | 
23: 6 0,848 |  six 0,030 | 5 0,015 |  sixth 0,004 |  Six 0,003 | 
24: 6 0,837 | 5 0,024 |  six 0,014 | 3 0,005 |  sixth 0,003 | 
25: 6 0,932 |  six 0,029 |  sixth 0,006 |  Six 0,005 | 5 0,002 | 
26: 6 0,934 |  six 0,023 | 5 0,004 |  sixth 0,004 |  Six 0,003 | 
27: 6 0,956 |  six 0,013 | 5 0,007 |  sixth 0,002 | 3 0,002 | 
28: 6 0,980 | 5 0,009 | 3 0,003 |  six 0,002 | 2 0,001 | 
29: 6 0,982 | 5 0,012 | 3 0,001 |  six 0,001 | 2 0,000 | 
30: 6 0,985 | 5 0,013 | 3 0,001 | 2 0,000 | 7 0,000 | 
31: 6 0,960 | 5 0,029 | 3 0,005 | 7 0,002 | 2 0,002 | 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;“6” (the correct prediction) appears in block #16, and is then amplified in the following blocks. To narrow down the analysis, I’ll initially focus on block #16.&lt;/p&gt;
&lt;h3&gt;zero ablation&lt;/h3&gt;
&lt;p&gt;To figure out where the “6” output comes from, I first zeroed the MLP output in block #16. This did not change the result much:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;f6&quot;&gt;14:  mul 0,023 |  Mul 0,014 | sono 0,010 |  tie 0,008 | vice 0,006 | 
15: sono 0,019 |  Mul 0,014 |  Pas 0,011 | vice 0,008 |  tie 0,006 | 
16: sono 0,015 | &lt;b class=&quot;red&quot;&gt;6 0,013&lt;/b&gt; | vice 0,009 | tring 0,008 |  kwiet 0,008 | 
17: &lt;b class=&quot;red&quot;&gt;6 0,770&lt;/b&gt; |  six 0,085 |  Six 0,054 |  sixth 0,022 | 六 0,007 | 
18: 6 0,817 |  six 0,074 |  Six 0,036 |  sixth 0,010 | 5 0,004 | 
19: 6 0,906 |  six 0,041 |  sixth 0,007 | 5 0,005 |  Six 0,004 | 
20: 6 0,934 |  six 0,016 | 5 0,009 |  sixth 0,006 |  Six 0,003 | 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, I tried to find the attention heads responsible for the output, by zero-ablating attention heads one at a time. The following image shows the probability of the “6” token prediction when zeroing each of the 32 attention heads in block #16:&lt;/p&gt;
&lt;img class=&quot;&quot; src=&quot;https://grgv.xyz/assets/copycat/6_plot.png&quot; /&gt;
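&lt;p&gt;The sweep itself is simple; a sketch with a stubbed-out forward pass (&lt;code&gt;run_with_ablation&lt;/code&gt; is a hypothetical stand-in, not a real API):&lt;/p&gt;

```python
# One-head-at-a-time sweep: re-run the model with each head zeroed and record
# the probability of the "6" token. run_with_ablation is a toy stand-in that
# mimics the observed result (head #24 being the outlier).
import numpy as np

def run_with_ablation(head):
    return 0.05 if head == 24 else 0.74

n_heads = 32
probs = np.array([run_with_ablation(h) for h in range(n_heads)])
outlier = int(np.argmin(probs))  # head whose removal hurts the prediction most
print(outlier)                   # 24
```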
&lt;p&gt;Head #24 is an outlier, and my guess is that it is responsible for copying “6” into the correct position.&lt;br /&gt;
Here is how this attention head’s weights look:&lt;/p&gt;
&lt;img class=&quot;&quot; src=&quot;https://grgv.xyz/assets/copycat/llama_vis_2-1.png&quot; /&gt;
&lt;p&gt;Indeed, attention for the last token, which should predict “6” (bottom row), is focused on the 3rd token from the end, which is the “6” in the input (the bright pixel near the bottom right corner).&lt;/p&gt;
&lt;h3&gt;next steps&lt;/h3&gt;
&lt;p&gt;This has been a small fraction of the required analysis, and the approach has been quite naive. But at this point I have exhausted my knowledge of interpretability methods, which so far includes only the logit lens and zero ablation. I’ll continue trying to find the circuit responsible for solving the test prompts; I just need to learn more MI first.&lt;/p&gt;
&lt;h3&gt;references&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&quot;http://worrydream.com/refs/Hofstadter%20-%20Analogy%20as%20the%20Core%20of%20Cognition.pdf&quot;&gt;Analogy as the Core of Cognition by Douglas R. Hofstadter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Moskvichev, Arseny et al. “The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain.” &lt;em&gt;ArXiv&lt;/em&gt; abs/2305.07141 (2023): n. pag.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens&quot;&gt;interpreting GPT: the logit lens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Li, Maximilian et al. “Circuit Breaking: Removing Model Behaviors with Targeted Ablation.” &lt;em&gt;ArXiv&lt;/em&gt; abs/2309.05973 (2023): n. pag.&lt;/li&gt;
&lt;/ol&gt;
</content>
  </entry>
  <entry>
    <title>Bird-eye view visualization of LLM activations</title>
    <link href="https://grgv.xyz/blog/llama/"/>
    <updated>2023-10-08T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/llama/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;I’m starting to learn about &lt;a href=&quot;https://www.neelnanda.io/mechanistic-interpretability/quickstart&quot;&gt;mechanistic interpretability&lt;/a&gt;, and I’m seeing lots of great visualizations of transformer internals, but somehow I’ve never seen the whole large model’s internal state shown at once, on one image.&lt;/p&gt;
&lt;p&gt;So I made this visualization, for Llama-2-7B. Attention matrices are on the left, in 32 rows for 32 blocks, top to bottom. To the right, there are 64 rows: the residual stream (odd rows) and internal MLP activations (even rows). Finally, the output MLP and the unembedding layer are at the bottom.&lt;/p&gt;
&lt;p&gt;Activation maps are downscaled horizontally, with max-pooling, to fit into a 1000px-wide image.&lt;/p&gt;
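&lt;p&gt;A numpy sketch of the horizontal max-pooling step (the windowing scheme is my assumption of one way to do it):&lt;/p&gt;

```python
# Downscale an activation map horizontally with max-pooling: each output
# column keeps the maximum over a window of input columns.
import numpy as np

def downscale_w(act, target_w=1000):
    h, w = act.shape
    edges = np.linspace(0, w, target_w + 1).astype(int)   # window boundaries
    cols = [act[:, a:b].max(axis=1) for a, b in zip(edges[:-1], edges[1:])]
    return np.stack(cols, axis=1)

act = np.random.default_rng(0).normal(size=(32, 4096))    # toy activation map
print(downscale_w(act, 1000).shape)                       # (32, 1000)
```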
&lt;p&gt;Example for a prompt “2+2=”:&lt;/p&gt;
&lt;img class=&quot;w-100&quot; src=&quot;https://grgv.xyz/assets/llama_vis/llama_vis_2_2p2_notes.png&quot; /&gt;
&lt;p&gt;And an example for the prompt: &amp;quot;William Shakespeare was born in the year”:&lt;/p&gt;
&lt;img class=&quot;w-100&quot; src=&quot;https://grgv.xyz/assets/llama_vis/llama_vis_2_sh.png&quot; /&gt;
&lt;p&gt;And for the prompt &amp;quot;blue pencils fly over moonlit toasters”:&lt;/p&gt;
&lt;img class=&quot;w-100&quot; src=&quot;https://grgv.xyz/assets/llama_vis/llama_vis_2_nonsense.png&quot; /&gt;
&lt;p&gt;Probably not especially useful for interpretability, but at least it looks pretty )&lt;/p&gt;

</content>
  </entry>
  <entry>
    <title>Browser-based simulator for Petoi Bittle robot dog</title>
    <link href="https://grgv.xyz/blog/bittle/"/>
    <updated>2023-10-07T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/bittle/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;The next iteration of my &lt;a href=&quot;https://grgv.xyz/blog/simulator4/&quot;&gt;experiments with robot simulation&lt;/a&gt; is a demo of a simulator for &lt;a href=&quot;https://www.petoi.com/pages/bittle-open-source-bionic-robot-dog&quot;&gt;Petoi Bittle&lt;/a&gt;, a neat palm-sized robot dog. Its size and price make it great for democratizing robotics research and education.&lt;/p&gt;
&lt;p&gt;Thanks to Rongzhong Li of &lt;a href=&quot;https://www.petoi.com/&quot;&gt;Petoi&lt;/a&gt; for inspiring this demo and providing the data for creating the robot model.&lt;/p&gt;
&lt;video controls=&quot;&quot; loop=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://grgv.xyz/assets/ranger_oct_2023_01_2.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;The new robot model is used with the &lt;a href=&quot;https://grgv.xyz/blog/simulator2/&quot;&gt;quadruped simulator&lt;/a&gt;, which is based on the &lt;a href=&quot;https://github.com/gkjohnson/urdf-loaders&quot;&gt;urdf-loaders&lt;/a&gt; library, the &lt;a href=&quot;https://github.com/dimforge/rapier.js&quot;&gt;rapier.js&lt;/a&gt; physics engine and the &lt;a href=&quot;https://threejs.org/&quot;&gt;three.js&lt;/a&gt; 3D framework.&lt;/p&gt;
&lt;p&gt;There are 2 robot models in the demo below: the &lt;em&gt;&lt;strong&gt;control model&lt;/strong&gt;&lt;/em&gt; on the left and the &lt;em&gt;&lt;strong&gt;simulation model&lt;/strong&gt;&lt;/em&gt; on the right. The &lt;em&gt;&lt;strong&gt;control model&lt;/strong&gt;&lt;/em&gt; can be interactively manipulated, and it sets the pose of the &lt;em&gt;&lt;strong&gt;simulation model&lt;/strong&gt;&lt;/em&gt;, which is a proper physics/dynamics model.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Procedural generation of stepping and walking motion (described in the &lt;a href=&quot;https://grgv.xyz/blog/simulator3/&quot;&gt;previous post&lt;/a&gt;) can be toggled with the “walk” button&lt;/li&gt;
&lt;li&gt;Use WASD keys on the keyboard or on the screen for moving the robot.&lt;/li&gt;
&lt;li&gt;Press the “T” key to switch between moving and rotating the control points.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;container&quot;&gt;
  &lt;iframe class=&quot;responsive-iframe&quot; src=&quot;https://grgv.xyz/demo/bittle_demo_Oct_2023_01/&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;
&lt;p&gt;The performance of the Rapier physics engine is really impressive, but making a realistic simulation with it is challenging. I haven&#39;t yet been able to set up joint motors that keep the robot model upright when it’s standing or walking. It could be a problem with the settings of the motors, or an inherent issue of the implementation.&lt;/p&gt;
&lt;p&gt;For now, to get around this limitation and make a usable demo, I had to reduce the simulated world’s gravity and the robot’s weight. In the next version I’ll have to either find a fix for the Rapier joints, or move to a different physics engine like PhysX.&lt;/p&gt;

</content>
  </entry>
  <entry>
    <title>GPT-4 for personal productivity: online distraction blocker</title>
    <link href="https://grgv.xyz/blog/awf/"/>
    <updated>2023-09-24T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/awf/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;There are many apps for blocking distracting websites: &lt;a href=&quot;https://freedom.to/&quot;&gt;freedom.to,&lt;/a&gt; &lt;a href=&quot;https://www.proginosko.com/leechblock/&quot;&gt;leechblock&lt;/a&gt;, &lt;a href=&quot;https://selfcontrolapp.com/&quot;&gt;selfcontrol&lt;/a&gt;, &lt;a href=&quot;https://getcoldturkey.com/&quot;&gt;coldturkey&lt;/a&gt;, just to name a few. They are useful for maintaining focus, avoiding procrastination, and curbing addictive web surfing.&lt;/p&gt;
&lt;p&gt;They work well for blocking a list of a few distracting websites. For me, this is not enough, because I spend a large portion of my time on a large number of websites that I check out for a minute or two and then never visit again. It’s just impossible to maintain a blocklist for this long tail. Also, the web has grown so much that there are too many easily found alternatives to any blocked distraction.&lt;/p&gt;
&lt;p&gt;Well, GPT-4 to the rescue! With an LLM it’s possible to block websites based on the content, checking each page — if it’s distracting or useful/productive.&lt;/p&gt;
&lt;p&gt;To test the idea I have implemented a prototype of a distraction filtering browser extension. This way, GPT-4 is turning into a personal productivity assistant!&lt;/p&gt;
&lt;h3&gt;Implementation&lt;/h3&gt;
&lt;p&gt;I just reused code from my &lt;a href=&quot;https://grgv.xyz/blog/apc/&quot;&gt;previous similar project&lt;/a&gt;, which was a prototype of a parental control extension.&lt;/p&gt;
&lt;p&gt;The extension sends the content of each loaded page to OpenAI API, and asks GPT if the page should be blocked. The prompt can be edited in the config window; the following prompt is used by default:&lt;/p&gt;
&lt;pre class=&quot;language-jsx&quot;&gt;&lt;code class=&quot;language-jsx&quot;&gt;You are a smart web filter&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; a distraction blocker &lt;span class=&quot;token keyword&quot;&gt;with&lt;/span&gt; goal &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; improving user&#39;s productivity&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;
Following types &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; pages are distracting and should be blocked&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
entertainment&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; shopping&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; online stores&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; social networking&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; news&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; magazines&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; lists &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; links&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; blogs without technical&lt;span class=&quot;token operator&quot;&gt;/&lt;/span&gt;educational content&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;
Following types &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; pages are useful &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; work and should not be blocked&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
software development&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; technical information&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; general reference&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; manuals&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; answers to technical questions
Should a web page &lt;span class=&quot;token keyword&quot;&gt;with&lt;/span&gt; following content be blocked&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; Answer only &lt;span class=&quot;token constant&quot;&gt;YES&lt;/span&gt; or &lt;span class=&quot;token constant&quot;&gt;NO&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; followed &lt;span class=&quot;token keyword&quot;&gt;with&lt;/span&gt; a newline and a brief explanation&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;
&lt;span class=&quot;token operator&quot;&gt;===&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;text&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There is no backend yet, requests are sent directly from the browser. This requires using &lt;em&gt;dangerouslyAllowBrowser&lt;/em&gt; option of the OpenAI API, and the user has to specify their own API key. Requests are cached in localStorage of the extension.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/coolvision/awf&quot;&gt;The code is on github&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;Sensitive content, whitelist &amp;amp; blacklist&lt;/h3&gt;
&lt;p&gt;While the extension is active, it sends a sample of each visited page’s content to OpenAI API. This might be a problem for pages with sensitive information.&lt;/p&gt;
&lt;p&gt;You can add any domains that you do not want to expose to OpenAI to the whitelist or the blacklist. Matched pages are allowed or blocked without sending anything to OpenAI.&lt;/p&gt;
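&lt;p&gt;A Python sketch of that pre-check (names and list contents are illustrative; the extension itself is JavaScript):&lt;/p&gt;

```python
# List check that runs before anything is sent to the API: a whitelist or
# blacklist hit short-circuits the GPT-4 call (domain lists here are examples).
from urllib.parse import urlparse

WHITELIST = {"docs.python.org", "github.com"}
BLACKLIST = {"twitter.com"}

def local_verdict(url):
    host = urlparse(url).hostname or ""
    if host in WHITELIST:
        return "allow"
    if host in BLACKLIST:
        return "block"
    return None  # no match: fall through to the GPT-4 check

print(local_verdict("https://github.com/coolvision/awf"))  # allow
print(local_verdict("https://example.com/page"))           # None
```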
&lt;p&gt;OpenAI is claiming to &lt;a href=&quot;https://openai.com/security&quot;&gt;handle user data securely&lt;/a&gt;, and to not use data submitted via API for model training. Still, if you have any concerns about the privacy and security of the pages that you visit, and if you do not want to risk leaking your browsing history, avoid using this extension.&lt;/p&gt;
&lt;h3&gt;Installation and testing&lt;/h3&gt;
&lt;p&gt;To try it out, download the extension (&lt;a href=&quot;https://github.com/coolvision/awf/releases/download/0.1/awf-0.1.zip&quot;&gt;github.com/coolvision/awf/releases/download/0.1/awf-0.1.zip&lt;/a&gt;), or build with npm:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npm install
npm run build
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;install it (&lt;a href=&quot;https://developer.chrome.com/docs/extensions/mv3/getstarted/development-basics/#load-unpacked&quot;&gt;instructions&lt;/a&gt;), and enter your API key in the extension’s config page.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/Screenshot-2023-09-24-01.png&quot; alt=&quot;extension page&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Then navigate to any page; it might get blocked:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/Screenshot-2023-09-24-02.png&quot; alt=&quot;blocked page&quot; /&gt;&lt;/p&gt;
&lt;h3&gt;Does it work?&lt;/h3&gt;
&lt;p&gt;I have been using it for a few days, and it does work quite well, with correct decisions in most cases.&lt;/p&gt;
&lt;p&gt;One problem is that GPT-4 is expensive, and my usage has been up to ~$1/day. It would probably cost $10-30/month, which is not too much, but still a thing to improve.&lt;/p&gt;
&lt;p&gt;Another issue is that the OpenAI API is quite slow: it takes several seconds (up to 5-10s) to validate each page. I haven’t decided yet if it’s a feature or a problem — on one hand, it does make web browsing more mindful, which is good; on the other, it kills the flow/momentum when I want to quickly research something.&lt;/p&gt;
&lt;p&gt;There should be simple ways to speed it up — it&#39;s probably not necessary to check each page; checking just a few pages per domain should be enough, but this needs more experimentation and testing.&lt;/p&gt;

</content>
  </entry>
  <entry>
    <title>ChatGPT-based parental control browser extension</title>
    <link href="https://grgv.xyz/blog/apc/"/>
    <updated>2023-09-01T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/apc/</id>
    <content xml:lang="en" type="html">
&lt;em&gt;&lt;strong&gt;disclaimer: for now, this is just a proof of concept prototype, it is not ready to be used as a sole parental control application&lt;/strong&gt;&lt;/em&gt;

&lt;p&gt;Parental control apps usually rely on blacklists, which is not flexible enough for sites that are a mixed bag of useful and disturbing content, like Wikipedia. Some parental control apps (and Google’s SafeSearch) would use AI to filter content that’s “nudity, sexually explicit material, violence and gore, etc…”. But those are still not smart enough to block things like racism, bigotry, and disinformation.&lt;/p&gt;
&lt;p&gt;What about using an LLM as a content filter for parental control? Delegating content supervision to GPT would allow for much more flexible and safe web exploration.&lt;/p&gt;
&lt;h3&gt;implementation&lt;/h3&gt;
&lt;p&gt;I started with the simplest approach — send page content to OpenAI API, and ask GPT if the page is all right.&lt;/p&gt;
&lt;p&gt;This is my first attempt at prompt engineering, and here is what I came up with:&lt;/p&gt;
&lt;pre class=&quot;language-jsx&quot;&gt;&lt;code class=&quot;language-jsx&quot;&gt;You are a parental control app&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; designed to protect &lt;span class=&quot;token function&quot;&gt;children&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;up to age &lt;span class=&quot;token number&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; from inappropriate content&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;
It&#39;s obvious what content should be blocked&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; violence&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; sexual&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; hateful&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; social networks&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; criminal activity&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; 
crimes&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; war crimes&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; atrocities&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; medical&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; drugs&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; explicit content&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; pornography&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; racism&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; bigotry&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; misinformation&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; pseudoscience etc&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;
Anything that would disturb a child&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt; Please err on the safe side&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;
Should a web page &lt;span class=&quot;token keyword&quot;&gt;with&lt;/span&gt; the following url and content be blocked&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; answer only &lt;span class=&quot;token constant&quot;&gt;YES&lt;/span&gt; or &lt;span class=&quot;token constant&quot;&gt;NO&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; followed &lt;span class=&quot;token keyword&quot;&gt;with&lt;/span&gt; a newline and a brief explanation&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;
&lt;span class=&quot;token operator&quot;&gt;===&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;text&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Where {text} is the first 1k words of the page content. The browser extension is built around this prompt.&lt;/p&gt;
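&lt;p&gt;The two pieces of glue around the prompt, sketched in Python (the extension is JavaScript; helper names are illustrative):&lt;/p&gt;

```python
# Truncate page text to its first 1000 words, and parse the model's
# "YES/NO + explanation" reply into a block/allow decision.

def first_words(text, n=1000):
    return " ".join(text.split()[:n])

def parse_verdict(reply):
    first_line = reply.strip().splitlines()[0].strip().upper()
    return first_line.startswith("YES")  # True means the page gets blocked

page = "word " * 3000
print(len(first_words(page).split()))             # 1000
print(parse_verdict("YES\nViolent content."))     # True
print(parse_verdict("NO\nEducational content."))  # False
```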
&lt;ul&gt;
&lt;li&gt;I started with the &lt;a href=&quot;https://vitejs.dev/guide/#scaffolding-your-first-vite-project&quot;&gt;Vite example project&lt;/a&gt;, and added &lt;a href=&quot;https://crxjs.dev/vite-plugin/getting-started/vanilla-js/create-project#install-crxjs-vite-plugin&quot;&gt;CRXJS&lt;/a&gt; to turn it into an extension.&lt;/li&gt;
&lt;li&gt;The code for sending requests, caching results in the extension’s &lt;em&gt;localStorage&lt;/em&gt;, and blocking pages is just &lt;a href=&quot;https://github.com/coolvision/apc/blob/master/content.js&quot;&gt;~100 LOC&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;There is no backend yet, requests are sent directly from the browser. This requires using &lt;em&gt;dangerouslyAllowBrowser&lt;/em&gt; option of the OpenAI API, and the user has to specify their own API key.&lt;/li&gt;
&lt;li&gt;The extension is targeted only for Chrome for now.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/coolvision/apc&quot;&gt;The code is on github&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;testing&lt;/h3&gt;
&lt;p&gt;To try it out, download the extension release (&lt;a href=&quot;https://github.com/coolvision/apc/releases/download/0.1/apc-0.1.zip&quot;&gt;github.com/coolvision/apc/releases/download/0.1/apc-0.1.zip&lt;/a&gt;), install it (&lt;a href=&quot;https://developer.chrome.com/docs/extensions/mv3/getstarted/development-basics/#load-unpacked&quot;&gt;instructions&lt;/a&gt;), and enter your API key in the extension’s config page.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/Screenshot_from_2023-09-14_21-10-50.png&quot; alt=&quot;extension page&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Then navigate to any page; it might get blocked:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://grgv.xyz/assets/Screenshot_from_2023-09-14_20-58-51.png&quot; alt=&quot;blocked page&quot; /&gt;&lt;/p&gt;
&lt;p&gt;After some testing, I found that gpt-3.5-turbo does not work for this task: too many obvious errors. gpt-4, on the other hand, works great! Some examples are shown in the table below; decisions that I think are wrong are marked in red.&lt;/p&gt;
&lt;table class=&quot;f5-ns f6-m f7 w-100 v-top&quot;&gt;&lt;tbody&gt;
&lt;tr class=&quot;fw6&quot;&gt;
    &lt;td&gt;GPT-4&lt;/td&gt;
    &lt;td&gt;GPT-3.5 Turbo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
    &lt;td&gt;
        &lt;a href=&quot;https://twitter.com/&quot;&gt;https://twitter.com&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? YES&lt;/b&gt;&lt;br /&gt;
        The website represents a social network, which could expose a child to a range of potentially harmful or inappropriate content.
    &lt;/td&gt;
    &lt;td class=&quot;dark-red&quot;&gt;
        &lt;a href=&quot;https://twitter.com/&quot;&gt;https://twitter.com&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? NO&lt;/b&gt;&lt;br /&gt;
        This page is from Twitter and it is a social networking site. As per the given criteria, social networks should be blocked.
    &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
    &lt;td&gt;
        &lt;a href=&quot;https://www.facebook.com/&quot;&gt;https://www.facebook.com&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? YES&lt;/b&gt;&lt;br /&gt;
        This is a social networking website, which is generally inappropriate for children of this age range due to the potential for exposure to inappropriate content and privacy risks.
    &lt;/td&gt;
    &lt;td class=&quot;dark-red&quot;&gt;
        &lt;a href=&quot;https://www.facebook.com/&quot;&gt;https://www.facebook.com&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? NO&lt;/b&gt;&lt;br /&gt;
        The web page should not be blocked. While Facebook is a social network, it does not contain explicit or inappropriate content. It can be used by children to connect with friends and family, share photos and videos, and engage in other age-appropriate activities.
    &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt; 
    &lt;td&gt;
        &lt;a href=&quot;https://en.wikipedia.org/wiki/Warsaw_Uprising&quot;&gt;https://en.wikipedia.org/wiki/Warsaw_Uprising&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? YES&lt;/b&gt;&lt;br /&gt;
        This page contains historical content about war, related violence and mass murder which would be inappropriate and disturbing for children under 13.
    &lt;/td&gt;
    &lt;td class=&quot;dark-red&quot;&gt;
        &lt;a href=&quot;https://en.wikipedia.org/wiki/Warsaw_Uprising&quot;&gt;https://en.wikipedia.org/wiki/Warsaw_Uprising&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? NO&lt;/b&gt;&lt;br /&gt;
        This webpage provides information about the Warsaw Uprising, which was a historical event during World War II. While it does mention war and destruction, it does not contain any explicit or disturbing content that would be inappropriate for children.
    &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
    &lt;td&gt;
        &lt;a href=&quot;https://platform.openai.com/apps&quot;&gt;https://platform.openai.com/apps&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? NO&lt;/b&gt;&lt;br /&gt;
        The webpage is about OpenAI&#39;s applications, language models, and APIs which are technological topics and tools, and it contains no content that would be inappropriate or harmful to a child.
        &lt;/td&gt;
    &lt;td class=&quot;dark-red&quot;&gt;
        &lt;a href=&quot;https://platform.openai.com/apps&quot;&gt;https://platform.openai.com/apps&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? YES&lt;/b&gt;&lt;br /&gt;
        The web page should be blocked because it is a login page for a platform that may contain content that is not suitable for children. As a parental control app, it is important to block access to any login pages or platforms that may potentially expose children to inappropriate content.
    &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
    &lt;td&gt;
        &lt;a href=&quot;https://www.breitbart.com/&quot;&gt;https://www.breitbart.com&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? YES&lt;/b&gt;&lt;br /&gt;
        This webpage contains politically charged and potentially divisive content, which may be inappropriate for children aged 2-13. The use of complex political rhetoric and potential misinformation could be confusing and potentially distressing to a child.
    &lt;/td&gt;
    &lt;td class=&quot;dark-red&quot;&gt;
        &lt;a href=&quot;https://www.breitbart.com/&quot;&gt;https://www.breitbart.com&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? NO&lt;/b&gt;&lt;br /&gt;
        While the content on the page contains news articles, it does not contain any explicit or inappropriate content that would disturb a child. While the content on the page contains news articles, it does not contain any explicit or inappropriate content that would disturb a child.
    &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
    &lt;td&gt;
        &lt;a href=&quot;https://www.google.com/search?q=qanon&quot;&gt;https://www.google.com/search?q=qanon&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? YES&lt;/b&gt;&lt;br /&gt;
        The page contains links and information about QAnon, a group known for promoting baseless conspiracy theories. Given these generally involve harmful or disturbing claims, it&#39;s inappropriate for children.
        &lt;/td&gt;
    &lt;td&gt;
        &lt;a href=&quot;https://www.google.com/search?q=qanon&quot;&gt;https://www.google.com/search?q=qanon&lt;/a&gt;&lt;br /&gt;
        &lt;b&gt;block? YES&lt;/b&gt;&lt;br /&gt;
        The content of the webpage includes information about QAnon, which is a conspiracy theory and political movement. Considering the guidelines given, this content should be blocked to protect children from exposure to potentially harmful and inappropriate information.
    &lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;problems&lt;/h3&gt;
&lt;p&gt;There are several problems that have to be solved before this extension can be used in practice:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;After the page is loaded, its content is hidden behind an overlay while waiting for the LLM response. The problem is that there is a short delay before the overlay is added, so the page is briefly visible. This is not good for sites with prominent explicit graphical content. Probably, the &lt;em&gt;WebRequest&lt;/em&gt; API could be used to analyze content before the page is loaded?&lt;/li&gt;
&lt;li&gt;GPT-4 works well, but each request costs ~$0.03. This has to be optimized, possibly with finetuning of a smaller/cheaper model. Another approach would be to use a cascaded system, where only some of the requests would be sent to GPT-4, while most decisions would be handled by a smaller model.&lt;/li&gt;
&lt;li&gt;Installation is not very user-friendly, and sending requests directly from the browser is not great either; the extension needs authentication and a dedicated backend.&lt;/li&gt;
&lt;li&gt;The OpenAI API is quite slow; it takes several seconds to validate each page. Although this might actually be a good feature, as it makes web browsing more mindful.&lt;/li&gt;
&lt;li&gt;The main issue is, of course, proper testing. It would be irresponsible not to validate a parental control application on a large dataset, to check whether random glitches and hallucinations affect content classification accuracy.&lt;/li&gt;
&lt;/ul&gt;
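&lt;p&gt;The cascade idea from the list above could look roughly like this: a cheap model answers when it is confident, and only uncertain cases are escalated to GPT-4. This is a sketch under my own assumptions; both classifier functions and the confidence threshold are hypothetical:&lt;/p&gt;

```javascript
// Sketch: route most requests to a cheap classifier, escalate uncertain ones.
// cheapCheck and expensiveCheck are hypothetical classifier functions that
// resolve to { block, confidence } and { block } respectively.
async function cascadeVerdict(url, cheapCheck, expensiveCheck, threshold = 0.9) {
  const cheap = await cheapCheck(url);
  if (cheap.confidence >= threshold) {
    return { block: cheap.block, escalated: false };
  }
  const expensive = await expensiveCheck(url);
  return { block: expensive.block, escalated: true };
}
```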

</content>
  </entry>
  <entry>
    <title>Browser-based robot dog simulator: walking controller</title>
    <link href="https://grgv.xyz/blog/simulator4/"/>
    <updated>2023-08-01T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/simulator4/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;Previously, I implemented a &lt;a href=&quot;https://grgv.xyz/blog/simulator3/&quot;&gt;controller for a quadruped robot for stepping in place&lt;/a&gt;. It is a good starting point for implementing a walking controller.&lt;/p&gt;
&lt;h3&gt;targets for placing feet&lt;/h3&gt;
&lt;p&gt;The only difference between stepping in place and walking is how the target points for lifting and planting the feet are selected.&lt;/p&gt;
&lt;p&gt;An idea frequently used in procedural animation is to set these target points relative to the character’s torso. Then, when the torso moves, the feet automatically follow it and step in the same direction.&lt;/p&gt;
&lt;p&gt;Three.js already supports a hierarchy of objects, so the only change is making the torso object the parent of the feet target objects.&lt;/p&gt;
&lt;p&gt;For stepping in place, feet target points are attached to three.js scene, and do not change position:&lt;/p&gt;
&lt;pre class=&quot;language-jsx&quot;&gt;&lt;code class=&quot;language-jsx&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Mesh&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;geometry &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;SphereGeometry&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0.01&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;material &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;MeshLambertMaterial&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
scene&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And for walking, they are attached to the torso object, and move along with it:&lt;/p&gt;
&lt;pre class=&quot;language-jsx&quot;&gt;&lt;code class=&quot;language-jsx&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Mesh&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;geometry &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;SphereGeometry&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0.01&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;material &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;MeshLambertMaterial&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;trunk&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;video controls=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_aug_2023_04_2.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;The video shows how feet targets (&lt;em&gt;this.step_targets&lt;/em&gt; from the code above, shown with white spheres) move with the torso.&lt;/p&gt;
&lt;h3&gt;walking motion generation&lt;/h3&gt;
&lt;p&gt;In the &lt;a href=&quot;https://grgv.xyz/blog/simulator3/&quot;&gt;post about stepping in place&lt;/a&gt;, I posted the code for motion generation. The code for walking is mostly the same; the only difference is transforming the feet targets from local coordinates (relative to the torso) into world-space coordinates.&lt;/p&gt;
&lt;p&gt;Code for stepping in place:&lt;/p&gt;
&lt;pre class=&quot;language-jsx&quot;&gt;&lt;code class=&quot;language-jsx&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token comment&quot;&gt;// set the target for lifting the foot&lt;/span&gt;
&lt;span class=&quot;token comment&quot;&gt;// it&#39;s the same as the initial standing position&lt;/span&gt;
s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;name&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;position&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;clone&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token comment&quot;&gt;// only raised 0.1m above ground&lt;/span&gt;
s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;y &lt;span class=&quot;token operator&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0.1&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And code modified to support walking:&lt;/p&gt;
&lt;pre class=&quot;language-jsx&quot;&gt;&lt;code class=&quot;language-jsx&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token comment&quot;&gt;// set the target for lifting the foot&lt;/span&gt;
&lt;span class=&quot;token comment&quot;&gt;// from the point attached to the torso&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; w &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Vector3&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;name&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getWorldPosition&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;w&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; 
s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; w&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token comment&quot;&gt;// raised 0.1m above ground&lt;/span&gt;
s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;y &lt;span class=&quot;token operator&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This approach to motion generation is quite flexible; it works for different gait variations, different walking directions, and for turning:&lt;/p&gt;
&lt;video controls=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_aug_2023_05_2.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/coolvision/ranger/tree/a1&quot;&gt;Code is on github&lt;/a&gt;, and an interactive demo with first-person control of the robot is below. Use the WASD keys on the keyboard, or the on-screen buttons, to move the robot.&lt;/p&gt;
&lt;div class=&quot;container&quot;&gt;
  &lt;iframe class=&quot;responsive-iframe&quot; src=&quot;https://grgv.xyz/demo/ranger_demo_Aug_2023_03/&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;
&lt;p&gt;This is a nice and simple approach, and I’m quite happy with how compact the code is. There are limitations of course:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;There is no sensing and no feedback loop, so if the robot loses balance, it will not recover and will most likely fall. A quadruped is quite a stable platform, but it&#39;s still easy to throw it off balance by moving and turning abruptly.&lt;/li&gt;
&lt;li&gt;Simulation parameters are approximate: I tweaked the values for the robot’s weight and motor settings; support for realistic parameters needs more work.&lt;/li&gt;
&lt;li&gt;Only walking on a flat surface is supported; any obstacles or elevation changes will make the robot lose balance.&lt;/li&gt;
&lt;/ul&gt;

</content>
  </entry>
  <entry>
    <title>Procedural animation for controlling a (simulated) robot dog: first steps</title>
    <link href="https://grgv.xyz/blog/simulator3/"/>
    <updated>2023-08-01T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/simulator3/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;My &lt;a href=&quot;https://grgv.xyz/blog/simulator2/&quot;&gt;previous post&lt;/a&gt; was about implementing a simulator for a quadruped robot. I got it to work, including inverse kinematics for controlling the robot’s feet and torso poses. The next step is making the simulated model move and walk. How hard can it be?&lt;/p&gt;
&lt;h3&gt;procedural animation&lt;/h3&gt;
&lt;p&gt;The simplest way to generate walking motion is very similar to &lt;a href=&quot;https://en.wikipedia.org/wiki/Procedural_animation&quot;&gt;procedural animation&lt;/a&gt;. It imitates motion by setting the poses of some of the character’s body parts and applying physics simulation to the rest. The character in this case is supported by external constraints; it’s more like a puppet than a fully simulated, self-supporting robot.&lt;/p&gt;
&lt;p&gt;My idea is to use a procedurally animated model to indirectly drive the “real” robot model: measure the angles of the animated model’s joints, and use these angles as targets for the motors of the actual simulated robot model.&lt;/p&gt;
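&lt;p&gt;This transfer step can be sketched as a simple loop: for each joint, read the angle from the animated puppet and set it as the motor target on the physics model. The object shapes here (&lt;em&gt;animJoints&lt;/em&gt;, &lt;em&gt;motors&lt;/em&gt;, the &lt;em&gt;angle&lt;/em&gt; and &lt;em&gt;target&lt;/em&gt; fields) are assumptions, not the simulator’s actual API:&lt;/p&gt;

```javascript
// Sketch: drive the simulated robot's motors with angles measured on the
// procedurally animated model. Field names are hypothetical.
function transferJointAngles(animJoints, motors) {
  for (const name of Object.keys(animJoints)) {
    const angle = animJoints[name].angle; // angle measured on the puppet
    motors[name].target = angle;          // position target for the robot's motor
  }
}
```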
&lt;h3&gt;stepping in place&lt;/h3&gt;
&lt;p&gt;Let’s start with just stepping in place. If the robot’s legs are controlled with inverse kinematics, the only thing needed is to raise and lower the feet. For the &lt;a href=&quot;https://en.wikipedia.org/wiki/Trot&quot;&gt;trot gait&lt;/a&gt;&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://grgv.xyz/blog/simulator3/#fn1&quot; id=&quot;fnref1&quot;&gt;[1]&lt;/a&gt;&lt;/sup&gt;, only one of the diagonal pairs of feet is moved, while the other pair stays on the ground.&lt;/p&gt;
&lt;p&gt;This can be managed with a simple state machine. Each of the feet goes through a sequence of states: &lt;em&gt;&lt;strong&gt;ground→lift→plant→ground&lt;/strong&gt;&lt;/em&gt;. Diagonal pairs of feet go through this sequence alternately, on every other step. Here is how the code looks:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;UnitreeA1&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;constructor&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// identifiers for the feet: front left, rear right...&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_names &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;FL_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;FR_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;RL_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;RR_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token comment&quot;&gt;// structure for feet state machines&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; f &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_names&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;f&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;token literal-property property&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;ground&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token comment&quot;&gt;// ground, lift, plant&lt;/span&gt;
                &lt;span class=&quot;token literal-property property&quot;&gt;goal&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;token comment&quot;&gt;// target point for moving the foot &lt;/span&gt;
                &lt;span class=&quot;token literal-property property&quot;&gt;goal_distance&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;token literal-property property&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; f
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_i &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

   &lt;span class=&quot;token function&quot;&gt;step&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; w &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;THREE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Vector3&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

        &lt;span class=&quot;token comment&quot;&gt;// count number of feet on the ground&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; n_ground &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; j &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_names&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;id &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;ground&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; n_ground&lt;span class=&quot;token operator&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// only start the next step when all 4 feet are on the ground&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;n_ground &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_i&lt;span class=&quot;token operator&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

        &lt;span class=&quot;token comment&quot;&gt;// select which feet to move&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; selected &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_i &lt;span class=&quot;token operator&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;// on even steps, select front left and rear right feet&lt;/span&gt;
            selected&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;RR_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            selected&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;FL_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;// on odd steps, select front right and rear left feet&lt;/span&gt;
            selected&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;RL_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            selected&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;FR_foot&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;n_ground &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;token comment&quot;&gt;// when all 4 feet are on the ground&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;// start the next step: put selected feet into the &quot;lift&quot; state&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; s &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; selected&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;id &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;lift&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
                &lt;span class=&quot;token comment&quot;&gt;// set target point for lifting as default standing position&lt;/span&gt;
                &lt;span class=&quot;token comment&quot;&gt;// raised 0.1m above ground&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;name&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getWorldPosition&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;w&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; 
                s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; w&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;clone&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
                s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;y &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token comment&quot;&gt;// raise the foot&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

        &lt;span class=&quot;token comment&quot;&gt;// move feet positions towards goal points, by 0.05m at a time&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; j &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_names&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; 
            &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal_distance &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; 
                &lt;span class=&quot;token function&quot;&gt;move_towards&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;position&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; 
                             &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;feet_states&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;j&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0.05&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

        &lt;span class=&quot;token comment&quot;&gt;// if all feet have reached the goal points&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;selected&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;every&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; v&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal_distance &lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0.001&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;selected&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;every&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; v&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;id &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;lift&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;token comment&quot;&gt;// if reached the &quot;lift&quot; goal, switch to &quot;plant&quot;&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; s &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; selected&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;token comment&quot;&gt;// the goal for planting the foot is default standing position&lt;/span&gt;
                    &lt;span class=&quot;token comment&quot;&gt;// s.goal = this.step_targets[s.name].position.clone();&lt;/span&gt;
                    &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;step_targets&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;name&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getWorldPosition&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;w&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; 
                    s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; w&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;clone&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
                    s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goal&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;y &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0.02&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
                    s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;id &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;plant&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;selected&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;every&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; v&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;id &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;plant&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;token comment&quot;&gt;// if reached the &quot;plant&quot; goal, the foot is on the ground &lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; s &lt;span class=&quot;token keyword&quot;&gt;of&lt;/span&gt; selected&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                    s&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;id &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;ground&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
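The move_towards helper called near the end of the excerpt is not shown in the post; a minimal sketch of what it needs to do (nudge a point a fixed distance toward a goal and report the remaining distance) could look like this, with plain {x, y, z} objects standing in for THREE.Vector3:

```javascript
// Minimal sketch of the move_towards helper referenced above: moves `p`
// up to `step` meters toward `goal` and returns the remaining distance.
// Plain {x, y, z} objects stand in for THREE.Vector3 here.
function move_towards(p, goal, step) {
    const dx = goal.x - p.x;
    const dy = goal.y - p.y;
    const dz = goal.z - p.z;
    const dist = Math.sqrt(dx * dx + dy * dy + dz * dz);
    if (dist > step) {
        // still far away: move one step along the direction to the goal
        p.x += (dx / dist) * step;
        p.y += (dy / dist) * step;
        p.z += (dz / dist) * step;
        return dist - step;
    }
    // close enough: snap onto the goal
    p.x = goal.x; p.y = goal.y; p.z = goal.z;
    return 0;
}
```

Returning the remaining distance is what the state machine checks against the 0.001 threshold to decide that a lift or plant goal has been reached.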
&lt;p&gt;&lt;em&gt;&lt;strong&gt;step()&lt;/strong&gt;&lt;/em&gt; is called at 30Hz. Here is how the resulting motion looks:&lt;/p&gt;
&lt;video controls=&quot;&quot; loop=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_aug_2023_01.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;There are two robot models in the video: the model on the left is animated by setting its feet positions. Its joint angles are then transferred to the model on the right, which is the unconstrained full simulation model.&lt;/p&gt;
&lt;p&gt;The simulation model bounces after stepping. The Rapier.js physics engine supports only rigid-body simulation, so the feet hit the ground hard. On the real robot the foot pads are elastic, which provides damping. A quick fix is to add a short delay after planting each foot, which at least lets the model settle. Here is how the model looks with the delay; the bouncing is reduced:&lt;/p&gt;
&lt;video controls=&quot;&quot; loop=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_aug_2023_02.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
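The settle delay can be as simple as a counter that holds the gait for a few frames once all feet are planted, before the next step is allowed to start. A sketch; the SettleDelay class and the settle_frames parameter are illustrative, not the post's actual code:

```javascript
// Sketch of the post-planting settle delay: hold the gait for a few
// frames after all feet are planted, before starting the next step.
// `settle_frames` is an assumed tuning parameter.
class SettleDelay {
    constructor(settle_frames) {
        this.settle_frames = settle_frames;
        this.counter = 0;
    }
    // called once per step() tick; returns true when stepping may resume
    tick(all_feet_on_ground) {
        if (!all_feet_on_ground) {
            this.counter = 0; // reset while any foot is in the air
            return false;
        }
        this.counter += 1;
        return this.counter > this.settle_frames;
    }
}
```

At 30Hz a handful of frames is only a fraction of a second, which is enough for the rigid feet to stop bouncing before the next pair lifts.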
&lt;p&gt;So this works well, but what happens when the robot tries to move forward while stepping in place? It will tilt and fall:&lt;/p&gt;
&lt;video controls=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_aug_2023_03.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/coolvision/ranger/tree/a1&quot;&gt;Code is on github&lt;/a&gt;, and a live demo is below:&lt;/p&gt;
&lt;div class=&quot;container&quot;&gt;
  &lt;iframe class=&quot;responsive-iframe&quot; src=&quot;https://grgv.xyz/demo/ranger_demo_Aug_2023_02/&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;
&lt;p&gt;Next, &lt;a href=&quot;https://grgv.xyz/blog/simulator4/&quot;&gt;implementing walking motion generation&lt;/a&gt;.&lt;/p&gt;
&lt;hr class=&quot;footnotes-sep&quot; /&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol class=&quot;footnotes-list&quot;&gt;
&lt;li id=&quot;fn1&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;Or, in case of stepping in place, this can be called a &lt;em&gt;piaffe&lt;/em&gt;: &lt;a href=&quot;https://en.wikipedia.org/wiki/Piaffe&quot;&gt;&amp;quot;The piaffe (French pronunciation: [pjaf]) is a dressage movement where the horse is in a highly collected and cadenced trot, in place or nearly in place&amp;quot;&lt;/a&gt; &lt;a href=&quot;https://grgv.xyz/blog/simulator3/#fnref1&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;

</content>
  </entry>
  <entry>
    <title>Browser-based robot dog simulator</title>
    <link href="https://grgv.xyz/blog/simulator2/"/>
    <updated>2023-07-01T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/simulator2/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;I’m continuing to &lt;a href=&quot;https://grgv.xyz/blog/simulator/&quot;&gt;learn about robot simulation&lt;/a&gt;, and the next step is simulating a real robot, rather than a simplified platform.&lt;/p&gt;
&lt;p&gt;I’ve chosen a quadruped (Unitree A1) for the next demo. The first step is to load the robot description from the &lt;a href=&quot;https://github.com/unitreerobotics/unitree_ros&quot;&gt;official repository&lt;/a&gt;, using the &lt;a href=&quot;https://github.com/gkjohnson/urdf-loaders&quot;&gt;urdf-loaders library&lt;/a&gt;. Next, the robot’s links and joints can be created in the &lt;a href=&quot;https://github.com/dimforge/rapier.js&quot;&gt;rapier.js&lt;/a&gt; physics engine for simulation and added to a three.js scene for visualization. For each loaded robot body part, we need to create:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;rapier.js rigid body object and corresponding three.js mesh&lt;/li&gt;
&lt;li&gt;rapier.js collider object&lt;/li&gt;
&lt;li&gt;rapier.js joints for linking body parts together&lt;/li&gt;
&lt;/ul&gt;
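The three steps above can be sketched as one wiring loop. This is an illustration rather than the project's actual code: world stands in for the rapier.js world (whose real createRigidBody/createCollider/createImpulseJoint methods take richer descriptor arguments), and each part is assumed to carry a name, descriptors, and an optional parent:

```javascript
// Hedged sketch of the per-part wiring listed above; `world` is any object
// providing the three hooks (in rapier.js these are createRigidBody,
// createCollider and createImpulseJoint, with richer signatures).
function buildRobot(world, parts) {
    const bodies = {};
    for (const part of parts) {
        // 1. rigid body (the real code also creates a matching three.js mesh)
        bodies[part.name] = world.createRigidBody(part.bodyDesc);
        // 2. collider attached to that body
        world.createCollider(part.colliderDesc, bodies[part.name]);
        // 3. joint linking this part to its parent, if it has one
        if (part.parent) {
            world.createImpulseJoint(part.jointDesc,
                                     bodies[part.parent], bodies[part.name]);
        }
    }
    return bodies;
}
```

In the real loader the descriptors come from the parsed URDF: link geometry for the colliders, and joint origins and axes for the impulse joints.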
&lt;p&gt;The physics model connected with joints works well for inverse kinematics, where some of the model’s part positions are set with 3D controls. We set the positions of the robot’s torso and feet, and the poses of the remaining parts and joints are then computed by the physics simulation. An interactive demo is shown below. Let’s call this the &lt;em&gt;&lt;strong&gt;control model&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;
&lt;div class=&quot;container&quot;&gt;
    &lt;iframe class=&quot;responsive-iframe&quot; src=&quot;https://grgv.xyz/demo/ranger_demo_jul_2023_01/&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;
&lt;p&gt;Inverse kinematics can be used to drive another robot model, one that actually simulates the robot’s body dynamics. Let’s call this the &lt;em&gt;&lt;strong&gt;simulation model&lt;/strong&gt;&lt;/em&gt;. This model’s part poses are not constrained; they are controlled by setting joint motors/angles. The angle values are obtained by measuring the angles between body parts of the &lt;em&gt;&lt;strong&gt;control model&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The next demo shows two models: the &lt;em&gt;&lt;strong&gt;control model&lt;/strong&gt;&lt;/em&gt; on the left and the &lt;em&gt;&lt;strong&gt;simulation model&lt;/strong&gt;&lt;/em&gt; on the right. The &lt;em&gt;&lt;strong&gt;control model&lt;/strong&gt;&lt;/em&gt; can be manipulated interactively, and it sets the pose of the &lt;em&gt;&lt;strong&gt;simulation model&lt;/strong&gt;&lt;/em&gt;, which is a proper physics/dynamics model.&lt;/p&gt;
&lt;div class=&quot;container&quot;&gt;
  &lt;iframe class=&quot;responsive-iframe&quot; src=&quot;https://grgv.xyz/demo/ranger_demo_jul_2023_02/&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;
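Measuring a joint angle between two body parts boils down to taking the relative rotation between the parent and child orientations and reading its angle about the hinge axis. A self-contained sketch with plain {x, y, z, w} quaternion objects (in the actual code THREE.Quaternion can perform the same steps):

```javascript
// Sketch: the joint angle between parent and child is the angle of the
// relative rotation q_rel = conj(q_parent) * q_child about the hinge axis.
// Quaternions are plain {x, y, z, w} objects here.
function qConj(q) {
    return { x: -q.x, y: -q.y, z: -q.z, w: q.w };
}
function qMul(a, b) { // Hamilton product
    return {
        w: a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        x: a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        y: a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        z: a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
    };
}
function jointAngle(qParent, qChild, axis) {
    const q = qMul(qConj(qParent), qChild);
    // for a hinge joint the vector part of q lies along the hinge axis,
    // so projecting it onto the axis gives a signed angle
    const proj = q.x * axis.x + q.y * axis.y + q.z * axis.z;
    return 2 * Math.atan2(proj, q.w);
}
```

The measured angle can then be fed to the corresponding joint motor of the simulation model on every frame.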
&lt;p&gt;The whole project is &lt;a href=&quot;https://github.com/coolvision/ranger/tree/a1&quot;&gt;~800 LOC&lt;/a&gt;, which is quite compact for a realistic 3D robot simulator. I believe existing robotics simulators are frequently over-engineered, and it looks like some parts of a simulator can be simplified by relying on WebGL and WASM.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://grgv.xyz/blog/simulator3/&quot;&gt;Next, implementing a motion controller&lt;/a&gt;.&lt;/p&gt;

</content>
  </entry>
  <entry>
    <title>Browser-based robot simulator with three.js &amp; rapier.js</title>
    <link href="https://grgv.xyz/blog/simulator/"/>
    <updated>2023-03-01T00:00:00Z</updated>
    <id>https://grgv.xyz/blog/simulator/</id>
    <content xml:lang="en" type="html">
&lt;p&gt;While learning more about embodied AI and robot simulation, I’m building a simple simulator based on &lt;a href=&quot;https://threejs.org/&quot;&gt;Three.js&lt;/a&gt; and the &lt;a href=&quot;https://rapier.rs/&quot;&gt;Rapier physics engine&lt;/a&gt;.&lt;/p&gt;
&lt;video controls=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_mar_2023.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;The robot model is a mobile platform with a 6DOF manipulator and a 2-finger gripper. Each part of the robot is a three.js mesh, kept in sync with the Rapier rigid body. Parts are connected into a kinematic chain with Rapier joints. Arm pose can be set by dragging the gripper with three.js TransformControls.&lt;/p&gt;
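The sync mentioned above is a per-frame copy of each rigid body's pose onto its mesh. A sketch, with plain objects standing in for the THREE.Mesh, and body.translation()/body.rotation() following rapier.js's accessor shape:

```javascript
// Sketch of the per-frame sync described above: copy each rigid body's
// pose onto its render mesh. body.translation() / body.rotation() follow
// rapier.js's accessor shape; the mesh is a plain stand-in for THREE.Mesh.
function syncMeshes(pairs) {
    for (const { mesh, body } of pairs) {
        const t = body.translation();
        mesh.position = { x: t.x, y: t.y, z: t.z };
        const r = body.rotation();
        mesh.quaternion = { x: r.x, y: r.y, z: r.z, w: r.w };
    }
}
```

With real three.js objects the same copy is mesh.position.copy(t) and mesh.quaternion.copy(r), run once per render frame after the physics step.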
&lt;p&gt;Surprisingly, inverse kinematics works just based on physics simulation, without the need for an additional explicit solver.&lt;/p&gt;
&lt;video controls=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_mar_2023_ik.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;It even has advanced features like self-collision-aware motion planning; an example of self-collision avoidance is shown below:&lt;/p&gt;
&lt;video controls=&quot;&quot; class=&quot;w-100&quot;&gt;
  &lt;source src=&quot;https://meek-pony-a3b971.netlify.app/ranger_mar_2023_ik5.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;p&gt;Initially, I assumed that adding inverse kinematics would be the hardest part of the project, so it’s nice that it comes for free. Compared to a proper IK solver, physics-based IK is less accurate, not optimal, and offers less control over the resulting trajectories. But the simplicity is still a big benefit.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/coolvision/ranger&quot;&gt;Code is on github&lt;/a&gt;, and a live demo is below.&lt;/p&gt;
&lt;p&gt;Keyboard controls for the demo:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;WASD: move the platform&lt;/li&gt;
&lt;li&gt;G: open/close gripper&lt;/li&gt;
&lt;li&gt;T, R: translate or rotate the gripper&lt;/li&gt;
&lt;li&gt;Z: toggle zoom with scrolling&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;container&quot;&gt;
  &lt;iframe class=&quot;responsive-iframe&quot; src=&quot;https://grgv.xyz/demo/ranger_demo_mar_2023/&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

</content>
  </entry>
</feed>