Inside JavaScript's core - part 1

July 08, 2022


In my previous article we talked about what JavaScript is. In this article I want to go deeper and start exploring how JavaScript works, focusing on the most famous JavaScript engine: Google’s V8.

An "engine" is a self-contained, but externally-controllable, piece of code that encapsulates powerful logic designed to perform a specific type of work.

Every major browser comes with one already built-in…

and so on…

In this series I am going to talk about V8, and today I will focus on Garbage Collection.

But first, a quick introduction.

What is this V8 thing?

Citing the official docs:

V8 is Google’s open source high-performance JavaScript and WebAssembly engine

It is written in C++ and is embedded in open source projects like Node.js and Deno.

It is what provides all the data types, operators, objects and functions.
It is what manages the memory of a JavaScript program.
It is what optimizes the code to run faster.
It is what makes the thing you write, happen.

💡 - Did you know you can access V8 native syntax in Node.js by passing the --allow-natives-syntax flag?

Open Source

The V8 source code can be downloaded, studied and improved by everyone. The docs also have a section for anyone who would like to contribute to its development.

Build from Source
Mirror Repo
Contribute

Implements ECMAScript

What does it even mean?
What is ECMAScript?
In the article where I explained what JavaScript is, I mentioned that JavaScript conforms to the ECMAScript specification.

ECMA (European Computer Manufacturers Association, known today as Ecma International) is an association that develops standards for technologies. ECMA-262 is the standard that contains the specification for a general-purpose scripting language. This specification is called ECMAScript.

A standard is a document that provides rules, guidelines or characteristics for activities or their results, for common and repeated use. Standards are created by bringing together all interested parties including manufacturers, users, consumers and regulators of a particular material, product, process or service.

💡 - Scripting vs Programming

Manages JavaScript Memory

Regardless of the programming language, the memory lifecycle can be summarized as:

Allocation -> Usage -> Release

Some languages, such as C++, require you to allocate and release memory manually:

#include <iostream>

int main() {
    int* ptr = new int; // Dynamically allocate memory for an int
    *ptr = 12;

    std::cout << *ptr << std::endl;
    delete ptr; // Deallocate the memory manually
    return 0;
}

JavaScript does not.
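
For comparison, here is a rough JavaScript counterpart of the snippet above. It is only a sketch (the variable names are made up for illustration), but it shows that all three lifecycle steps still exist, we just never write the allocation and release ourselves.

```javascript
// Allocation: happens implicitly when a value is created.
let point = { x: 12, y: 7 };     // the engine allocates the object
const label = `x is ${point.x}`; // the engine allocates the string

// Usage: reading and writing the values.
console.log(label);

// Release: there is no `delete ptr` equivalent. Dropping the last
// reference only makes the object eligible for garbage collection.
point = null;
```

Note that setting `point` to `null` does not free anything immediately; it is up to the engine to decide when the memory is actually reclaimed.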

So, how in the hell does JavaScript handle memory?

Heap and Stack

JavaScript engines have two places where they store data: the heap and the stack.

As a simplified mental model, the stack is where static data gets stored: method and function frames, primitive values and pointers to objects. The memory heap is where non-primitive data (objects, arrays, functions) is stored. The heap is also where Garbage Collection takes place.

Every variable first lives on the stack. If its value is not a primitive, the stack holds a reference that points to the object in the heap.
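
We cannot look at the stack or the heap directly from JavaScript, but we can observe the consequence of this model: primitives behave like copied values, while objects behave like shared references. The snippet below is a small sketch of that behaviour (all names are made up), and it also hints at why pushing into a const array works.

```javascript
// Primitives are copied by value.
let a = 10;
let b = a;
b = 20; // `a` is untouched

// Objects are copied by reference: both names point to the same heap object.
const first = { count: 1 };
const second = first;
second.count = 2; // visible through `first` as well

// `const` locks the binding (the reference), not the object it points to.
const list = [1, 2];
list.push(3); // allowed: the array living in the heap is mutated
// list = []; // would throw "TypeError: Assignment to constant variable."

console.log(a, b, first.count, list.length); // 10 20 2 3
```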

💡 - Why can I push elements in const Arrays ?

Garbage Collector

We just learned that the primitives, objects and functions we create all take up memory. So when is that memory cleaned up?
In JavaScript, the engine does all the hard work for us.

V8 achieves this by combining a few core concepts: reachability, marking, sweeping and compaction.

Reachability

An object that is currently reachable within the runtime must be kept, and any unreachable object may be collected.
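
A tiny sketch of what reachable means in practice (the `user` and `admin` names are just for illustration): an object stays alive as long as at least one reference leads to it.

```javascript
let user = { name: "John" };
let admin = user; // a second reference to the same object

user = null; // the object is still reachable through `admin`
const stillReachable = admin.name; // "John"

admin = null; // no reference is left: the object may now be collected
```

We cannot observe the exact moment the object is collected; the engine decides that on its own.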

Marking

Marking is the process by which reachable objects are found.

Starting from a set of known object pointers (the root set), it follows each pointer to a JavaScript object and marks that object as reachable. It then proceeds recursively to follow every pointer in all found objects until every reachable object has been found and marked.
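
To make the idea concrete, here is a toy version of marking over a hand-built object graph. It is a sketch, not how V8 implements it: `makeObject`, `refs` and `marked` are hypothetical names, and an explicit worklist is used instead of recursion.

```javascript
// Toy heap object: an id, outgoing references and a mark bit.
function makeObject(id) {
  return { id, refs: [], marked: false };
}

// Follow every pointer starting from the roots and mark what we find.
function mark(roots) {
  const worklist = [...roots];
  while (worklist.length > 0) {
    const obj = worklist.pop();
    if (obj.marked) continue; // already visited (also handles cycles)
    obj.marked = true;
    worklist.push(...obj.refs);
  }
}

const a = makeObject("a");
const b = makeObject("b");
const c = makeObject("c");
const d = makeObject("d");
a.refs.push(b);
b.refs.push(a, c); // a and b reference each other
d.refs.push(c);    // d points to c, but nothing points to d

const heap = [a, b, c, d];
mark([a]); // `a` is the only root

const reachable = heap.filter((obj) => obj.marked).map((obj) => obj.id);
console.log(reachable); // [ 'a', 'b', 'c' ]
```

Here `d` stays unmarked even though it holds a pointer to a live object: what matters is who points to you, not who you point to.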

Sweeping

Sweeping is a process where gaps in memory left by dead objects are added to a data structure called a free-list.

A free list is a data structure generally used for dynamic memory allocation that connects unallocated regions of memory together in a linked list.
When new data has to be allocated, V8 looks at the free list to find an appropriate chunk of memory.
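
A minimal sketch of sweeping and of allocating from a free list follows. The memory layout is invented, a plain array stands in for the linked list, and allocation uses a simple first-fit search, so treat it as an illustration of the idea rather than V8's actual strategy.

```javascript
// Toy heap after marking: cells in address order (hypothetical layout).
const cells = [
  { start: 0,  size: 16, marked: true },
  { start: 16, size: 32, marked: false },
  { start: 48, size: 8,  marked: false },
  { start: 56, size: 24, marked: true },
  { start: 80, size: 16, marked: false },
];

// Collect the gaps left by dead (unmarked) cells, merging adjacent ones.
function sweep(heapCells) {
  const freeList = [];
  for (const cell of heapCells) {
    if (cell.marked) {
      cell.marked = false; // reset the mark bit for the next cycle
      continue;
    }
    const last = freeList[freeList.length - 1];
    if (last && last.start + last.size === cell.start) {
      last.size += cell.size; // adjacent gap: grow the previous entry
    } else {
      freeList.push({ start: cell.start, size: cell.size });
    }
  }
  return freeList;
}

// First-fit allocation: take the first gap that is big enough.
function allocate(freeList, size) {
  const index = freeList.findIndex((gap) => gap.size >= size);
  if (index === -1) return null; // no gap is large enough
  const gap = freeList[index];
  const address = gap.start;
  gap.start += size;
  gap.size -= size;
  if (gap.size === 0) freeList.splice(index, 1);
  return address;
}

const freeList = sweep(cells);          // gaps at 16 (40 bytes) and 80 (16 bytes)
const address = allocate(freeList, 40); // 16
```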

Compaction

Compaction is the process of copying surviving objects into other pages that are not currently being compacted (using the free-list for that page). This puts the small, scattered gaps left behind by dead objects to use and reduces fragmentation.
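
The sketch below shows the core move of compaction in a simplified form: survivors are copied next to each other and we remember where each one went. It assumes an empty target page and index-based addresses, whereas the real thing copies into existing pages through their free lists. All names are hypothetical.

```javascript
// Hypothetical fragmented page: `null` is a gap left by a dead object.
const fragmentedPage = [
  { id: "a", marked: true },
  null,
  { id: "b", marked: false }, // dead, not worth copying
  { id: "c", marked: true },
  null,
  { id: "d", marked: true },
];

// Copy the survivors into a fresh page and record old -> new positions,
// so that pointers to moved objects could be updated afterwards.
function compact(page) {
  const target = [];
  const forwarding = new Map();
  page.forEach((slot, oldIndex) => {
    if (slot !== null && slot.marked) {
      forwarding.set(oldIndex, target.length);
      target.push(slot);
    }
  });
  return { target, forwarding };
}

const { target, forwarding } = compact(fragmentedPage);
console.log(target.map((obj) => obj.id)); // [ 'a', 'c', 'd' ]
console.log(forwarding.get(3));           // 1
```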


Although you do not need to worry about freeing memory every time, this doesn’t exempt you from carefully reviewing your code.
Things such as badly implemented recursion or an infinite loop can exhaust your available memory pretty quickly, and references you forget to drop can lead to memory leaks.
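
One common way to leak memory in a garbage-collected language is to keep references around that you no longer need, for example in a long-lived cache. The sketch below (function and variable names are made up) contrasts a `Map`, which keeps its keys alive, with a `WeakMap`, which does not.

```javascript
// A long-lived Map holds strong references to its keys.
const leakyCache = new Map();
function rememberLeaky(request, result) {
  leakyCache.set(request, result);
}

// A WeakMap holds its keys weakly: an entry does not keep its key alive.
const weakCache = new WeakMap();
function rememberWeak(request, result) {
  weakCache.set(request, result);
}

let request = { url: "/users" };
rememberLeaky(request, "cached response");
rememberWeak(request, "cached response");

request = null; // we are done with the request object...

// ...but leakyCache still references it, so it stays reachable.
console.log(leakyCache.size); // 1
// The weakCache entry, on the other hand, no longer keeps it alive.
```

From the collector's point of view nothing is wrong with `leakyCache`: the object is still reachable, so it must be kept. The leak is in our code, not in the engine.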



Generations

V8 splits the heap into different regions called Generations.

  • Young Generation - split further into Nursery and Intermediate
  • Old Generation

Objects are first allocated in the nursery. An object that survives one Garbage Collection is moved to the Intermediate generation, and if it survives another one it is moved to the Old Generation.

But why is that?

Most objects die young

V8's generational heap is designed to exploit this assumption about object lifetimes, known as the Generational Hypothesis.

Remember the concept of Compaction: if we assume that objects are most likely to die in the initial generations, then by moving only the objects that survive, V8 pays a copying cost that is proportional to the number of surviving objects, not to the number of allocations.
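
Here is a toy model of that promotion scheme, counting how many objects get copied. It is a deliberately naive sketch: the `heap` object, `allocate` and `minorGC` are hypothetical names, and the set of live objects is passed in by hand instead of being computed by marking.

```javascript
const heap = { nursery: [], intermediate: [], old: [] };
let copies = 0;

function allocate(name) {
  heap.nursery.push(name);
}

// `liveSet` stands in for the result of marking the young generation.
function minorGC(liveSet) {
  const promoted = heap.intermediate.filter((obj) => liveSet.has(obj));
  const survivors = heap.nursery.filter((obj) => liveSet.has(obj));
  copies += promoted.length + survivors.length; // only survivors are moved
  heap.old.push(...promoted);    // second survival: Intermediate -> Old
  heap.intermediate = survivors; // first survival: Nursery -> Intermediate
  heap.nursery = [];             // everything else is simply dropped
}

for (let i = 0; i < 1000; i++) allocate(`temp${i}`); // short-lived objects
allocate("config");                                  // long-lived object

minorGC(new Set(["config"]));
const copiesAfterFirstGC = copies; // 1, despite 1001 allocations

minorGC(new Set(["config"]));
console.log(heap.old, copies); // [ 'config' ] 2
```

The thousand temporary objects never cost a single copy, which is exactly the bet the Generational Hypothesis makes.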


Orinoco is the codename of V8's Garbage Collector project, which makes use of the latest and greatest parallel, incremental and concurrent techniques for garbage collection in order to free the main thread.

Geez, what does that even mean?
Well, one important metric when measuring Garbage Collection is the time that the main thread spends paused while GC is performed.

In traditional stop-the-world garbage collectors this time can add up and become a burden on the user experience (latency or poor rendering). This is why Orinoco introduced 3 techniques to shorten those pauses.

  1. Parallel: the main thread and the helper threads split the work roughly equally. This is still a stop-the-world approach, but the total pause time is now divided by the number of participating threads. No JavaScript runs during the pause.
  2. Incremental: the main thread doesn’t do an entire Garbage Collection at once, but small amounts of work intermittently. This is more difficult because JavaScript executes between each segment, meaning the heap changes in the meantime, which risks invalidating the work the GC has already done.
  3. Concurrent: the main thread executes JavaScript with no interruptions while helper threads do the Garbage Collection work in the background. This carries the same risk as the incremental technique, plus constant races between the main thread and the helper threads reading and writing the same objects.
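
To see why the incremental approach is tricky, here is a toy marker that works in small steps and lets the "program" run in between. It is only a sketch built on a generator (`makeObject`, `incrementalMark` and `budget` are invented names) and it deliberately omits the extra bookkeeping a real collector needs, so that the problem becomes visible.

```javascript
function makeObject(id) {
  return { id, refs: [], marked: false };
}

// Marks at most `budget` objects per step, then hands control back.
function* incrementalMark(roots, budget) {
  const worklist = [...roots];
  let done = 0;
  while (worklist.length > 0) {
    const obj = worklist.pop();
    if (obj.marked) continue;
    obj.marked = true;
    worklist.push(...obj.refs);
    done += 1;
    if (done % budget === 0) yield; // pause: JavaScript gets to run
  }
}

const root = makeObject("root");
const child = makeObject("child");
const late = makeObject("late");
root.refs.push(child);

const marker = incrementalMark([root], 1);
marker.next();        // step 1: marks `root` and queues `child`
root.refs.push(late); // the program runs: `root` was already visited
while (!marker.next().done) {
  // remaining marking steps
}

console.log(child.marked); // true
console.log(late.marked);  // false, although `late` is reachable
```

A naive sweep at this point would free `late` while the program is still using it. This is the "invalidated work" mentioned above, and it is why a real incremental collector has to notice writes that happen between its steps.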

V8’s stop-the-world, generational, accurate garbage collector is one of the keys to V8’s performance.

Conclusion

Phew...
I won't go deeper into how Garbage Collection is implemented in V8, because it is a huge and complex topic and one article wouldn't really be enough. I introduced the main concepts behind V8 and how it allows us to write code without thinking too much about memory management (always keeping the best practices in mind though). V8 is an insanely huge piece of technology that implements a lot of clever concepts and I cannot wait to unveil them one by one.
See ya in the next article.

References

How Javascript works
V8 Docs
Javascript memory
Memory Management

© 2022, Thomas Albertini