Neuroevolution To Play Super Mario Bros.

Introduction

In today’s article, I would like to tell you about a domain I find absolutely fascinating: Machine Learning. This article will show you how I implemented the NEAT Algorithm to play Super Mario Bros. on an emulated Nintendo Entertainment System (NES).

This article covers the theory as well as the implementation I wrote in Rust. I’ve also adapted the project to compile to WebAssembly, so you can see a fully functional demonstration of neural networks playing the NES. The whole source code is available on my forgejo instance. At the end of the article, you can find some statistics I found interesting about the neuroevolution process.

Before diving into this article, I would like to personally thank Lulu for the compute resources and the interest he took in the project. You can find his personal projects on his forgejo instance: https://git.az4aaz.xyz/explore/repos. I would also like to thank Roro for the theoretical resources he shared with me. This article is just a bunch of things from those resources that I found really interesting to share with you, reformulated with my hands on the keyboard (no AI there). If you find this article boring, or if it doesn’t cover parts you would like to dig into, please go directly to the bibliography section and find the articles below (mainly the Deep Learning book, which is a treasure of knowledge Roro decided to share with me). If you need more information about the NEAT algorithm in particular, please find the paper in the bibliography.

Disclaimer

I’m not associated with Nintendo Co. Ltd. I hope what I’m doing is legal, and if it’s not, meh, I suppose I’m sorry ?

Theory

What is the NEAT Algorithm?

NEAT stands for NeuroEvolution of Augmenting Topologies. The algorithm was designed to mimic Darwin’s theory of evolution. But to understand why and how it works, let’s dig into what a Neural Network is.

Brief Introduction to Machine Learning And Neural Networks

Machine Learning is a field of Computer Science dedicated to solving problems that are difficult to describe formally but that most humans find intuitive, e.g. recognizing a face, describing a drawing, suggesting an audio track. The idea behind machine learning is to let the computer learn from its own experience, so that no human operator has to describe step by step how to solve the problem.

To solve a problem using ML, we need to extract features from the problem. Let’s take the example of playing Super Mario Bros. The screen of the NES contains lots of pixels, which, taken individually, don’t give any relevant information. But the position of monsters or obstacles will be useful if we want the algorithm to finish the level: they are features (i.e. pertinent pieces of information describing the problem). The first step of solving the problem is therefore to define the right set of features. Those features are then used as inputs to compute a response to the problem. In our example, the response will be a mathematical function corresponding to the reaction Mario should have to any obstacle. This reaction (formally called the output) is whether Mario has to jump, go left, right, down…

The output function $f$ has to approximate the value of $y$, the solution of the problem. $f$ is computed by chaining other functions ($\sigma$, the sigmoid function, the Rectified Linear Unit, the hyperbolic tangent, etc.). The number of chained functions is called the depth of the neural network. That’s why, when a lot of functions are chained, we talk about deep learning.
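To make the idea of chaining concrete, here is a minimal sketch in Rust. The function names, the weights `w1` and `w2`, and the depth-2 composition are illustrative assumptions, not code from the project.

```rust
/// Sigmoid: squashes any real number into the interval (0, 1).
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Rectified Linear Unit: keeps positive values, replaces the rest with 0.
fn relu(x: f64) -> f64 {
    x.max(0.0)
}

/// A chain of depth 2: f(x) = sigmoid(w2 * relu(w1 * x)).
/// w1 and w2 are hypothetical weights chosen by the caller.
fn chained(x: f64, w1: f64, w2: f64) -> f64 {
    sigmoid(w2 * relu(w1 * x))
}
```

Adding more stages to `chained` is what makes a network "deeper".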

If you want to see an online demo of a classification problem (considered as difficult to describe formally), you can see the different functions we talked about in action right there : https://playground.tensorflow.org/

Why do we talk about Neural Networks?

The structure used to solve the problem is inspired by neuroscience. We represent neural networks as graphs: every node applies a function as described below, and values are multiplied by a coefficient called a weight as they go through a connection.

The kind of neural network I’ve decided to tell you about is the feed forward neural network. A feed forward neural network lets information flow from the input to the output, without allowing it to loop inside the network. Recurrent Neural Networks do allow such loops, letting a kind of memory emerge from the data recursion.

A feed forward neural network

This illustration is from wikipedia. In this figure, you can see the input data, represented as $x_i$, the coefficients of the input vector, called $X$. $w_{k_i}$ is the weight associated to the connection $i$ and $v_k$ is the node $k$. The activation function is $\varphi$ and the output is given in the node $y_k$.

I can hear a question you might want to ask: why do we use vectors here? Well, the answer is simple: to optimise the computation process. Nowadays, we have powerful GPUs able to perform matrix multiplication way faster than if it were done sequentially. So, instead of manually doing

// [...]
for i in 0..x.len() {
    vk += x[i] * w[k + i];
}
vk = phi(vk);
// [...]

we simply perform the matrix product $X \times W_k$ using vector units with CUDA (or any similar API) and then apply the activation function to the result.

Well, that’s what professionals do. I didn’t in my implementation, to keep the algorithm simple, but maybe I’ll do it in the future (with an FPGA? why not! -> this is a reference to a future article)

The RNN Approach

Basically, the first NEAT implementation was not recurrent. But the creator of the algorithm, Kenneth Stanley, clearly said that implementing NEAT as a Recurrent Neural Network doesn’t conflict with the process followed by the algorithm. This is what an RNN looks like:

A recurrent neural network

This illustration also comes from Wikipedia. As you can see, we need to unfold the neural network in order to evaluate it, or in other words, to get the output vector $Y$. We will see later that the process is similar for Feed Forward Neural Networks, as we need to apply a topological sort (i.e. finding out in which order we need to feed each neuron, or compute the value assigned to it). The main difference is that in an RNN, the same neuron will hold different values after being fed. But this example is just informative, as I decided to implement the NEAT algorithm with a Feed Forward approach to simplify the evaluation process.
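The topological sort mentioned above can be sketched with Kahn's algorithm. This is an illustration under my own assumptions (nodes numbered from 0, connections given as `(from, to)` pairs, a hypothetical `topological_order` function), not the code used in the project.

```rust
use std::collections::VecDeque;

/// Returns an order in which the nodes of a feed forward network can be
/// evaluated, or None if the connections contain a loop (a recurrent network).
/// Nodes are numbered 0..node_count; each connection is a (from, to) pair.
fn topological_order(node_count: usize, connections: &[(usize, usize)]) -> Option<Vec<usize>> {
    let mut in_degree = vec![0usize; node_count];
    let mut outgoing: Vec<Vec<usize>> = vec![Vec::new(); node_count];
    for &(from, to) in connections {
        outgoing[from].push(to);
        in_degree[to] += 1;
    }
    // Nodes without incoming connections (the inputs) are ready immediately.
    let mut ready: VecDeque<usize> = (0..node_count).filter(|&n| in_degree[n] == 0).collect();
    let mut order = Vec::with_capacity(node_count);
    while let Some(node) = ready.pop_front() {
        order.push(node);
        // Every node fed by `node` has one less pending input.
        for &next in &outgoing[node] {
            in_degree[next] -= 1;
            if in_degree[next] == 0 {
                ready.push_back(next);
            }
        }
    }
    // If some nodes were never ready, they are part of a loop.
    if order.len() == node_count {
        Some(order)
    } else {
        None
    }
}
```

Evaluating the neurons in the returned order guarantees that every neuron is computed after all the neurons feeding it.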

Where is the evolution you talked about ?

This is the main difference between classical neural networks and those implementing the NEAT algorithm. The NEAT algorithm is the following :

  1. First, we create a population of randomly connected neural networks

  2. We check how well each neural network is able to solve the problem, by giving a score to each of them (using a fitness function we’ll see later)

  3. Then, we group the neural networks (also called individuals) into species, by checking how similar they are (using similar connections, weights, etc.). Species are subgroups used to preserve innovation (we will see how later)

  4. Each individual inherits the score of the best individual of the species (we will see why)

  5. The species getting a better score through iterations are preserved, the other ones are wiped out

  6. Individuals in the surviving species reproduce and the offspring get mutations (a node or a connection can be added, a weight can be altered)

  7. The algorithm starts again at step 2 with the best individuals from this generation and the offspring.

Let’s make it more clear.

At step 1, we create a bunch of neural networks containing random connections, random weights, or both (some implementations fully connect all the networks instead). Then, we have to evaluate each neural network and give it a score. This score is established by a fitness function, which varies according to the problem we want to solve. In our example, the larger the $x$ distance Mario has walked in the level, the closer the level is to being finished. In other words, we want evolution to encourage the distance covered by the neural networks. So the fitness function returns the $x$ distance the individual was able to walk.

In our case, the fitness function will spawn the emulator, extract the features, and feed them to the neural network until Mario dies. Then, it will return the $x$ position Mario had reached when he died.

If we wanted greedy neural networks, this function could return the number of coins earned.

The next big part is speciation. To check whether individuals belong to the same species, we compute the distance $\delta$ between the representative of a species and an individual. Before we continue, you need to know that every node and connection is identified by a number stored in a global database. Since mutations can add nodes and connections to a neural network, we need to know whether a new node or connection is an innovation or not. That’s why we keep track of those values globally, from the first to the last population. Let’s continue with the compatibility distance. This distance is computed as follows:

$$ \delta = c_1\cdot\frac{E}{N}+c_2\cdot\frac{D}{N}+c_3\cdot\bar{W} $$

$N$ is the number of genes in the larger genome. This value is used to normalise the distance. $c_1$, $c_2$ and $c_3$ are used to adjust the impact of:

  • $E$ : The number of excess genes. Excess genes are connections whose ID lies outside the range of IDs of the other genome.

  • $D$ : The number of disjoint genes. Disjoint genes are connections whose ID lies within the range of the other genome, but that are not in its genotype.

  • $\bar{W}$ : The average weight difference between matching genes.
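Here is a minimal sketch of how this distance could be computed. It assumes each genome is a list of connection genes sorted by ID, and the `Gene` struct and function name are hypothetical, chosen for this example only.

```rust
/// A connection gene reduced to what the distance needs:
/// its global ID (innovation number) and its weight.
#[derive(Clone, Copy)]
struct Gene {
    innovation: usize,
    weight: f64,
}

/// Compatibility distance between two genomes sorted by innovation number.
/// c1, c2 and c3 are the coefficients of the formula above.
fn compatibility_distance(a: &[Gene], b: &[Gene], c1: f64, c2: f64, c3: f64) -> f64 {
    let (mut i, mut j) = (0, 0);
    let (mut disjoint, mut matching) = (0usize, 0usize);
    let mut weight_diff = 0.0;
    while i < a.len() && j < b.len() {
        let (x, y) = (a[i], b[j]);
        if x.innovation == y.innovation {
            // Matching gene: accumulate the weight difference.
            weight_diff += (x.weight - y.weight).abs();
            matching += 1;
            i += 1;
            j += 1;
        } else {
            // The smaller ID is missing from the other genome while still
            // being inside its range: a disjoint gene.
            disjoint += 1;
            if x.innovation < y.innovation { i += 1 } else { j += 1 }
        }
    }
    // Whatever is left lies beyond the last ID of the other genome: excess genes.
    let excess = (a.len() - i) + (b.len() - j);
    let n = a.len().max(b.len());
    if n == 0 {
        return 0.0;
    }
    let mean_weight_diff = if matching > 0 { weight_diff / matching as f64 } else { 0.0 };
    c1 * excess as f64 / n as f64 + c2 * disjoint as f64 / n as f64 + c3 * mean_weight_diff
}
```

An individual then joins a species when its distance to the representative is below some threshold, which is a tunable parameter.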

The crossover process in the NEAT algorithm

https://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf

Why do we group individuals into species? Since mutations change the properties of neural networks, a single altered weight can destroy the performance of the mutated network. This is really bad if we want to promote innovation, since this performance loss isn’t representative in the long term. Therefore, we use the fitness of the best individual as the fitness of all the individuals in the same species.
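As a small illustration of this fitness inheritance, here is a sketch assuming the population is stored as two parallel slices: `fitness[i]` is the score of individual `i` and `species[i]` is the index of its species. This layout and the function name are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Replaces the fitness of every individual with the best fitness found in
/// its species. `species[i]` is the species index of individual `i`.
fn share_best_fitness(fitness: &mut [f64], species: &[usize]) {
    // First pass: find the best score of each species.
    let mut best: HashMap<usize, f64> = HashMap::new();
    for (&s, &f) in species.iter().zip(fitness.iter()) {
        let entry = best.entry(s).or_insert(f);
        if f > *entry {
            *entry = f;
        }
    }
    // Second pass: every individual inherits the best score of its species.
    for (f, s) in fitness.iter_mut().zip(species) {
        *f = best[s];
    }
}
```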

The last important thing with NEAT is the crossover, or reproduction. The figure above shows how genes are inherited by the offspring from its parents. Disjoint and excess genes are inherited from the fittest parent, and matching genes are inherited randomly from either parent. If the parents have equal fitness, every gene is inherited randomly.

Speciation and gene tracking through a global database are the two things making the NEAT algorithm particularly efficient to preserve innovation.
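The gene tracking idea can be sketched with a simple map from a connection to its ID. The `InnovationDb` type below is hypothetical and only covers connections, it is not the structure used in my library.

```rust
use std::collections::HashMap;

/// Global record of every connection ever created, keyed by the
/// (from, to) node IDs of the connection.
#[derive(Default)]
struct InnovationDb {
    known: HashMap<(usize, usize), usize>,
}

impl InnovationDb {
    /// Returns the ID already assigned to this connection, or assigns the
    /// next free ID if the connection has never been seen before.
    fn innovation_for(&mut self, from: usize, to: usize) -> usize {
        let next = self.known.len();
        *self.known.entry((from, to)).or_insert(next)
    }
}
```

Two individuals that independently grow the same connection get the same ID, which is what lets the distance and the crossover line their genes up.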

The crossover process gives a solution to the competing conventions problem, or “which genes should I choose without losing the benefits of evolution?”
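Here is a minimal sketch of the crossover for the case where one parent is fitter than the other (the equal-fitness case is left out). The `Gene` struct is hypothetical, and the coin flip is passed in as a closure so that the example doesn't depend on a random number crate.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Gene {
    innovation: usize,
    weight: f64,
}

/// Builds the genome of a child. `fitter` is the parent with the higher
/// fitness, `other` is the second parent. `pick_fitter` is a coin flip
/// supplied by the caller: true keeps the fitter parent's version of a
/// matching gene, false takes the other parent's version.
fn crossover(fitter: &[Gene], other: &[Gene], mut pick_fitter: impl FnMut() -> bool) -> Vec<Gene> {
    let other_by_id: HashMap<usize, Gene> = other.iter().map(|g| (g.innovation, *g)).collect();
    fitter
        .iter()
        .map(|gene| match other_by_id.get(&gene.innovation) {
            // Matching gene: inherited from either parent at random.
            Some(twin) => {
                if pick_fitter() {
                    *gene
                } else {
                    *twin
                }
            }
            // Disjoint or excess gene: inherited from the fitter parent.
            None => *gene,
        })
        .collect()
}
```

The disjoint and excess genes of the less fit parent are simply never copied, which is why the loop only walks over the fitter genome.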

What we’ve seen so far

  • Neural Networks are just a way to let a computer learn autonomously from a dataset, composed of features.

  • Neural Networks are made of nodes and connections, chaining simple functions (sigmoid, ReLU, tanh) and coefficients (weights $w_k$) to solve problems that seem intuitive and are therefore difficult to explain formally, like recommending a song or describing a painting.

  • Those graphs, or networks are called neural because they are inspired by neuroscience.

  • There are two main types of Neural Networks: Feed Forward Neural Networks (also misnamed Multilayer Perceptron Neural Networks, but anyway) and Recurrent Neural Networks. This article covers an implementation I’ve made of a Feed Forward one.

  • To feed a Neural Network or a neuron is to give it some input data.

  • To evaluate a Neural Network is to feed it and to get the output data.

  • The NEAT algorithm combines neural networks with evolution theory, using speciation and crossover to preserve innovation and to solve the competing conventions problem

If everything seems clear to you, we can continue (: If not, please refer to the book and articles listed in the bibliography.

The implementation

Inspiration

The idea for this project comes from a video by the French YouTuber Laupok. I found the concept very innovative and wanted to do something similar by myself. I would like to thank Laupok for his video, because it was for me the beginning of a huge world of discoveries and marvelous mathematical things.

Now that we’re done with the thanks, let’s get to the most interesting part!

The language

Rust, because it’s just beautiful and so well made. I also wanted WebAssembly support to be able to show you a little demo right here.

The architecture

To let a neural network play the NES, and in particular Super Mario Bros., we first need a NES. I didn’t build the NES core myself, because I wanted to focus on the neuroevolution part. But this project helped me understand a lot about how a NES works. The core is from the Tetanes Core library by lukexor. From this core, I built my own emulator. This emulator is a bit particular, since you can’t play with it: it was designed for the neural network to play. Here is a part I found particularly interesting to show you this principle:

loop {
    // get the timestamp at the beginning of the loop
    let instant = Instant::now();
    get_screen(&control_deck, &mut buffer);
    biased_input[1..625].copy_from_slice(&buffer);
    biased_input[0] = 1.0;
	    
    let outputs = nn.feed_forward(&biased_input).unwrap();
    // apply the network outputs as controller key presses
    send_keys(&mut control_deck, outputs.as_slice().try_into().unwrap());
    //tx.send(outputs).unwrap();
		
    // clock frame
    control_deck.clock_frame().unwrap();
        
    deadline += 1.0;
        
    let pos = get_marios_position(&control_deck) as f64;
		
    if control_deck.wram()[0x000E] == 0x0B || pos <= deadline || pos >= 3200.0 {
        println!("fitness found : {}", pos);
        return pos;
    }
	    
    let mut frame_buffer = control_deck.frame_buffer().to_vec();
    let line_pos = coords_to_screen(&control_deck, deadline as usize, 0);
	    
    if let Some(line_pos) = line_pos {
        draw_line(line_pos, &mut frame_buffer);
    }
        
    screen_tx.send(frame_buffer).unwrap();
		
    let refresh_rate = Duration::from_millis(16); // 60 fps means about 16.67 ms between frames, rounded down to 16 ms here
    // we need to take the compute time into account so that each frame
    // lasts 16 ms in total
    let fixed_time = instant.elapsed();
    let sleep = if fixed_time >= refresh_rate {
        println!("[WARN] the emulator is running under 60 fps. frame time is {:?}. consider using a lower max_parallel value", fixed_time);
        Duration::from_millis(0)
    }
    else {
        // 16ms minus the time elapsed before
        refresh_rate - fixed_time
    };
        
    std::thread::sleep(sleep);       
}

Well, this is basically the emulator, with maybe some missing definitions, but the main idea is there.
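As a side note, the frame pacing at the end of this loop can be written more compactly with `Duration::saturating_sub` from the standard library. The helper name below is hypothetical, and unlike the original code this sketch doesn't print a warning when a frame overruns.

```rust
use std::time::Duration;

/// Time left to sleep in the current frame, or zero if computing the frame
/// already took longer than the budget.
fn remaining_frame_time(elapsed: Duration, budget: Duration) -> Duration {
    budget.saturating_sub(elapsed)
}
```

In the loop, `std::thread::sleep(remaining_frame_time(instant.elapsed(), refresh_rate))` would then stand in for the `if`/`else` block.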

In the loop above, the screen is extracted from the NES and then sent as the input of the neural network. There is also some timing logic to stay at 60 fps (the NES runs at about 60 fps). And you might ask “but you said those pixels, taken individually, didn’t make sense to train the neural network”. And that’s true.

Extracting Game Data From The NES

As I told you before, we need to extract features from the screen. I’ve decided to define the following features :

  • Mario’s position
  • Monsters’ position
  • Blocks’ position
  • Walkable, or “empty” positions (air)
pub fn get_screen(control_deck: &ControlDeck, buffer: &mut [f64; 624]) {
    // zeroize the buffer
    buffer.copy_from_slice(&[0.0f64; 624]);
		
    // number of the screen (page) mario is currently on
    let mario_screen_number: u16 = control_deck.wram()[0x006D] as u16; 
    // mario's position in the level modulo 256 (between 0 and 255)
    let mario_level_x = control_deck.wram()[0x0086] as u16;
    // mario's position in the screen  
    let mario_screen_x = control_deck.wram()[0x03AD] as u16; 
    // mario's y position
    let mario_screen_y = control_deck.wram()[0x3B8] as u16;  
		
    let mario_absolute_x = mario_screen_number as i64 * 256 + mario_level_x as i64; // mario's absolute position
    let left_edge_x: i64 = mario_absolute_x - mario_screen_x as i64; // left edge
	    
    // first layer : blocks
    for i in 0..DISPLAY_WIDTH_TILES {
        for j in 0..DISPLAY_HEIGHT_TILES {
            // overflow protection
            if left_edge_x >= 0 {
                // we divide by 16 to convert pixels to tiles
                let x = (left_edge_x / DISPLAY_WIDTH_TILES + i) % (DISPLAY_WIDTH_TILES * 2);
                if control_deck.wram()[coords_to_ram(x as u16, j as u16) as usize] != 0 {
                    buffer[DISPLAY_WIDTH_TILES as usize * j as usize + i as usize] = 1.0;
                }
            }
        }
    }
    // second layer : mario
    // mario's x starts at bottom left, we need to add a tile and to convert the result to a tile
    // number
    let mario_x: i64 = (mario_screen_x as i64 + 8) / 16;
    // same principle for y
    let mario_y: i64 = (mario_screen_y as i64 - 16) / 16;
    // the ram can contains weird values at start, so we need to check the values to prevent
    // overflow
    if mario_x < DISPLAY_WIDTH_TILES && mario_y < DISPLAY_HEIGHT_TILES as i64 && mario_x >= 0 && mario_y >= 0{
        buffer[DISPLAY_HEIGHT_TILES as usize * DISPLAY_WIDTH_TILES as usize + mario_y as usize * DISPLAY_WIDTH_TILES as usize + mario_x as usize] = 1.0;
        // is mario big ? if true, add a tile
        if control_deck.wram()[0x0754] == 0  && mario_y > 0 {
            buffer[DISPLAY_HEIGHT_TILES as usize * DISPLAY_WIDTH_TILES as usize + DISPLAY_WIDTH_TILES as usize*(mario_y as usize - 1) + mario_x as usize] = 1.0;
        }
    }
		
    // third layer : monsters
    for i in 0..5 {
        // if enemy i is drawn
        if control_deck.wram()[0x000F+i] == 1{
            let enemy_screen = control_deck.wram()[0x006E+i] as u16;
            let enemy_level_x = control_deck.wram()[0x0087+i] as u16;
            let enemy_x = (enemy_screen as i64 * 256 + enemy_level_x as i64 - left_edge_x + 8) / 16;
            let enemy_y = (control_deck.wram()[0x00CF+i] as i64 - 16) as i64 / 16;
			    
            if enemy_x < DISPLAY_WIDTH_TILES && enemy_y < DISPLAY_HEIGHT_TILES as i64 && enemy_x >= 0 && enemy_y >= 0{
                buffer[DISPLAY_HEIGHT_TILES as usize * DISPLAY_WIDTH_TILES as usize * 2 + (DISPLAY_WIDTH_TILES as usize * enemy_y as usize) + enemy_x as usize] = 1.0;
            }
        }
    }
}

To extract those features, we need to dump the RAM of the console. Gathering information about the Super Mario Bros. RAM map was a bit difficult, until I found maps on the internet.

To compute Mario’s absolute position, we read the values at 0x006D and 0x0086. The value at 0x006D contains the number of screens that have scrolled since the beginning of the level, and the value at 0x0086 contains Mario’s position modulo 256. As the NES screen is 256 pixels wide, Mario’s position is

$$\text{RAM[0x006D]} \cdot 256 + \text{RAM[0x0086]}$$

From Mario’s absolute position, we can compute the position of the left edge of the current screen. The screen tiles are loaded in memory in two pages, or two buffers, each matching the screen size, so the buffer is larger than the screen and we need to find out which part is currently displayed. The only other piece of information we need is Mario’s position on the screen, which is stored at 0x03AD. So the left edge position is

$$\text{mario\_absolute\_x} - \text{RAM[0x03AD]}$$
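As a worked example of these two formulas, here is a small sketch that follows the addresses used in the get_screen function above (screen number at 0x006D, position within the screen at 0x0086, on-screen position at 0x03AD). The function names are mine, and the byte values in the usage note are made up for illustration.

```rust
/// Mario's absolute x position in pixels, from the screen (page) number
/// read at 0x006D and the position within that page read at 0x0086.
fn absolute_x(screen_number: u8, x_in_screen: u8) -> i64 {
    screen_number as i64 * 256 + x_in_screen as i64
}

/// Absolute x position of the left edge of the visible screen, given
/// Mario's absolute position and his on-screen position read at 0x03AD.
fn left_edge_x(mario_absolute_x: i64, mario_screen_x: u8) -> i64 {
    mario_absolute_x - mario_screen_x as i64
}
```

For instance, with a screen number of 2 and a position of 40, Mario is at pixel 552 of the level. If he is drawn 112 pixels from the left of the screen, the left edge sits at pixel 440.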

From there, we no longer work with pixels, but with tiles: Mario’s screen is filled with tiles of $16\times16$ pixels.

Super Mario Bros. Tiles

https://www.spriters-resource.com/nes/supermariobros/asset/52571/

We then extract tiles from the RAM and put the features in three layers. Each layer has the dimensions of the screen in tiles; it is filled with zeros for the walkable (air) parts and with ones at the positions of the feature, in tiles.

To do so, we use the coords_to_ram function. It converts (x;y) tile coordinates from the screen into a RAM address:

pub fn coords_to_ram(x: u16, y: u16) -> u16 {
    let page = x/16; // whether we have to read from the first or the second page
                     // first page : 0x0500 -> 0x05cf
                     // second page : 0x05d0 -> 0x069f
    let x = x%16;    // x coordinate
    let y = page*13 + y;
	
    0x500 + x + y*16
}
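The three feature layers are stored one after the other in a single flat buffer of 624 values. The sketch below shows the indexing used by get_screen, assuming a screen of 16 by 13 tiles. I infer those two constants from the buffer size (3 × 16 × 13 = 624) and from the 13 rows per page in coords_to_ram, and the names here are hypothetical.

```rust
// Assumed values: 256 px / 16 px per tile, and 13 tile rows per page.
const WIDTH_TILES: usize = 16;
const HEIGHT_TILES: usize = 13;

/// Index of tile (x, y) of a given layer in the flat input buffer.
/// Layer 0 holds the blocks, layer 1 holds Mario, layer 2 holds the monsters.
fn buffer_index(layer: usize, x: usize, y: usize) -> usize {
    layer * WIDTH_TILES * HEIGHT_TILES + y * WIDTH_TILES + x
}
```

The last tile of the monsters layer lands on index 623, the last slot of the buffer.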

NEAT Implementation

I’ve decided to write my own library to implement the NEAT algorithm. You can find it here: https://git.anyn.one/Anynone/rennuos.git

This project contains two branches: a main one, designed to solve classical mathematical problems, and a maroil one, to play Super Mario Bros. on the NES.

One of the biggest problems I had was the training time. To solve it, I used chained threads. Instead of using classical scoped threads, I use an MPSC channel in each thread to tell if the thread has finished its work. If it is the case, a new thread spawns to evaluate a new neural network :

fn compute_fitnesses(&mut self, max_parallel: usize) {
        // number of spawned threads
        let mut j = 0;
        // number of terminated threads
        let mut k = 0;
        
        let pop_size = self.individuals.len();
        
        thread::scope(|s| {
            let mut receivers: Vec<Receiver<bool>> = Vec::with_capacity(max_parallel);
            let mut senders  : Vec<Sender<bool>> = Vec::with_capacity(max_parallel);
            for _ in 0..max_parallel{
                let (tx, rx) : (Sender<bool>,Receiver<bool>) = channel();
                senders.push(tx);
                receivers.push(rx);
            }
            let mut individuals = self.individuals.iter_mut();
            for tx in senders.iter().cloned() {
                // stop early if the population is smaller than max_parallel
                let Some(individual) = individuals.next() else { break };
                s.spawn(move || {
                    individual.compute_fitness(tx);
                });
                j+=1;
            }

            loop{

                for (i, tx) in senders.iter().cloned().enumerate() {
                    match receivers[i].try_recv() {
                        Ok(..) => {
                            // one more individual just finished!
                            k+=1;
                            // do we have to spawn another one ?
                            if j < pop_size {
                                // if we didn't go through the entire population, yes!
                                
                                let individual = individuals.next().unwrap();
                                s.spawn(move || {
                                    individual.compute_fitness(tx);
                                });
                                j+=1;
                            }
                        },
                        _ => {}
                    }
                }
                // number of terminated threads match the pop_size, the task is over
                if k >= pop_size { break; }
            }
        });
    }
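For comparison, here is a sketch of the same idea with a single channel shared by all the workers, where the main thread blocks on `recv` instead of polling every receiver in a loop. It is a simplified alternative written for this article, not the code of my library: the generic `evaluate_all` function and its `eval` closure are assumptions, and a panicking worker is not handled.

```rust
use std::sync::mpsc::channel;
use std::thread;

/// Evaluates every item with `eval`, running at most `max_parallel` threads
/// at the same time. The results are returned in the order of the input.
fn evaluate_all<T, F>(items: &[T], max_parallel: usize, eval: F) -> Vec<f64>
where
    T: Sync,
    F: Fn(&T) -> f64 + Sync,
{
    let mut results = vec![0.0; items.len()];
    // One channel for everybody: each worker reports (index, fitness).
    let (done_tx, done_rx) = channel::<(usize, f64)>();
    let eval = &eval;
    thread::scope(|s| {
        let mut next = 0; // index of the next item to start
        let mut running = 0; // number of threads currently alive
        let mut finished = 0; // number of results received
        while finished < items.len() {
            // Fill the free slots while there is work left.
            while running < max_parallel.max(1) && next < items.len() {
                let tx = done_tx.clone();
                let (index, item) = (next, &items[next]);
                s.spawn(move || {
                    tx.send((index, eval(item))).unwrap();
                });
                next += 1;
                running += 1;
            }
            // Sleep until any worker reports back (no busy waiting).
            let (index, fitness) = done_rx.recv().unwrap();
            results[index] = fitness;
            running -= 1;
            finished += 1;
        }
    });
    results
}
```

Blocking on `recv` lets the main thread sleep while the emulators run, so it doesn't compete with them for CPU time.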

The fitness function is the following :

pub fn fitness(nn: &mut NeuralNetwork, screen_tx: Sender<Vec<u8>>) -> f64 {
    let path = "../rom/mario.nes";
    let mut deadline = 0.0;
    // nes core
    let mut control_deck = ControlDeck::new();
    control_deck.set_filter(VideoFilter::Pixellate);
	    
    // load the rom with a specified path
    control_deck.load_rom_path(path).unwrap();
    control_deck.load_state("./init.sav").unwrap();
    // stick to the processor cycles  NES
    //pctd_ctrl_deck.lock().unwrap().set_cycle_accurate(true);
    //just for fun : http://nintendoforever.free.fr/Nes/SuperMarioBros1/SuperMarioBros1_Dossier/SuperMarioBros1_GameGenie.php
    //let _ = pctd_ctrl_deck.lock().unwrap().add_genie_code("SAGOOK".to_string());
    let mut buffer = [0.0f64; 624];
    let mut biased_input = [1.0f64; 624 + 1];
    loop {
        // get the timestamp at the beginning of the loop
        let instant = Instant::now();
        get_screen(&control_deck, &mut buffer);
        biased_input[1..625].copy_from_slice(&buffer);
        biased_input[0] = 1.0;
			
        let outputs = nn.feed_forward(&biased_input).unwrap();
        // apply the network outputs as controller key presses
        send_keys(&mut control_deck, outputs.as_slice().try_into().unwrap());
        //tx.send(outputs).unwrap();
			
        // clock frame
        control_deck.clock_frame().unwrap();
        // input management
        deadline += 1.0;
			
        let pos = get_marios_position(&control_deck) as f64;
			
        if control_deck.wram()[0x000E] == 0x0B || pos <= deadline || pos >= 3200.0 {
            println!("fitness found : {}", pos);
            return pos;
        }
			
        let mut frame_buffer = control_deck.frame_buffer().to_vec();
        let line_pos = coords_to_screen(&control_deck, deadline as usize, 0);
        if let Some(line_pos) = line_pos {
            draw_line(line_pos, &mut frame_buffer);
        }
        

        // system gives us RGBA : 4 times more data than the amount of pixels
        //let mut out: &mut [u8] = &mut [0,0,0,0];
        //Video::decode_buffer((pctd_ctrl_deck.lock().unwrap().frame_buffer_raw()), out);
        //aff.update(None, out, DISPLAY_WIDTH*4).unwrap();
        // update the canvas
        //canvas.copy(&aff, None, None).unwrap();
        // compute the best refresh rate
        screen_tx.send(frame_buffer).unwrap();
        //send_keys(&mut pctd_ctrl_deck.lock().unwrap(), outputs.as_slice().try_into().unwrap());
        let refresh_rate = Duration::from_millis(16); // 60 fps means about 16.67 ms between
                                                      // frames, rounded down to 16 ms here
        // we need to take the compute time into account so that each frame
        // lasts 16 ms in total
        let fixed_time = instant.elapsed();
        let sleep = if fixed_time >= refresh_rate {
            println!("[WARN] the emulator is running under 60 fps. frame time is {:?}. consider using a lower max_parallel value", fixed_time);
            Duration::from_millis(0)
        }
        else {
            // 16ms minus the time elapsed before
            refresh_rate - fixed_time
        };
		
        std::thread::sleep(sleep);
        
    }
}

As you can see here, I also added a deadline, because some individuals just walked a bit and stopped, waiting for the game timer to run out. This was a big waste of time during evaluation. The deadline is a line that progresses slower than Mario when he walks, but never stops. When the line catches up with Mario’s position, we stop the evaluation and return Mario’s current position.
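The three conditions that end an evaluation in the fitness function can be isolated in a small helper. This is a sketch: the `Stop` enum and the `stop_reason` function are hypothetical names, and `goal_x` stands for the 3200.0 limit used in the code above.

```rust
/// Why an evaluation run ended.
#[derive(Debug, PartialEq)]
enum Stop {
    Died,
    CaughtByDeadline,
    ReachedGoal,
}

/// Returns the reason to stop the evaluation, or None if it should go on.
/// `pos` is Mario's x position, `deadline` the x position of the line,
/// both in pixels from the start of the level.
fn stop_reason(mario_dead: bool, pos: f64, deadline: f64, goal_x: f64) -> Option<Stop> {
    if mario_dead {
        Some(Stop::Died)
    } else if pos <= deadline {
        Some(Stop::CaughtByDeadline)
    } else if pos >= goal_x {
        Some(Stop::ReachedGoal)
    } else {
        None
    }
}
```

Since the deadline advances by one pixel per frame, an individual that stands still is caught after at most as many frames as the distance it has covered so far.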

The Result

This result comes from 1605 generations of neuroevolution, computed in 3 days, thanks to Lulu’s puter. You can select the best individual from generation $i, i \in [0, 1605]$ and watch it play on an emulated NES directly in your web browser, by clicking on “run”.

You can find the source code of this web version in my forgejo : https://git.anyn.one/Anynone/maroil-webviewer

The whole project source code is here : https://git.anyn.one/Anynone/maroil

enjoy!

Bibliography

  • https://www.deeplearningbook.org/
  • https://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf
  • https://git.anyn.one/Anynone/rennuos.git
  • https://www.spriters-resource.com/nes/supermariobros/asset/52571/
  • https://en.wikipedia.org/wiki/Feedforward_neural_network
  • https://en.wikipedia.org/wiki/Recurrent_neural_network