
🦀 Learning Rust basics - x509 Certificate Parser

Introduction

This article aims to explain Rust basics and how to create a Rust application. I chose to build a certificate parser because it covers a wide range of topics, such as:

  • Control flows ;
  • Managing/Parsing I/O ;
  • Error handling ;
  • and more…

As input, I will use certificates logged in the Certificate Transparency (CT) system.

I am currently learning Rust and I am far from being an expert. I’m writing this article to dig deeper into these concepts, and I’m open to any improvements or advice.

Note
You can DM me on Twitter or Discord: zzzep

Why Rust ?

Rust is a modern programming language that prioritizes efficiency, security, and concurrency. Rust source code is compiled to native machine code, which makes execution very fast, often on par with equivalent C code.

Rust manages memory without a garbage collector. Thanks to its ownership rules, the compiler knows at compile time where each value must be freed and inserts the corresponding deallocation code itself. In safe Rust, this prevents whole classes of memory bugs (use-after-free, double free, data races), which makes the language much safer than C or C++ in that respect.

I chose Rust for this x509 certificate parser because I had a large volume of data to process. I had already written a parser in Python, but its performance was not good enough for that amount of data.

What is an X.509 certificate and Certificate Transparency ?

An X.509 certificate is a digital certificate that uses the X.509 standard to authenticate the identity of a person, organization, or device in a computer network. X.509 certificates are widely used in web browsers, email clients, and other applications that use secure communication protocols such as HTTPS, SSL, and TLS.

The Certificate Transparency Project is a system developed by Google and designed to improve the security of digital certificates used in the SSL/TLS encryption protocol. It provides a way to publicly log all issued SSL/TLS certificates, making it easier to detect and prevent fraudulent certificates and Certificate Authority (CA) compromises.

Info
At the time of writing, more than 8,780,500,000 certificates have been logged since 2013.
Tip
If you want to go deeper on CT logs check out this blog by Tingmao Wang

Prerequisites & Rust concepts

Linux, macOS & Windows can all be used here, since Rust lets you build applications for multiple systems.

You may use the editor of your choice, such as Vim, VS Code or CLion with the Rust plugin.

I will try to explain some Rust concepts here. If you want more Rust basics, you can get the free Rust basics course from Zero-Point Security, RustForN00bs, which is a great starting point.

Installing Rust

On Windows, you can download the rustup-init.exe installer from the rustup website (https://rustup.rs).

On Linux, run the following commands to install Rust.

# Download & install Rust Compiler + Cargo
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Source binaries
export PATH="$PATH:$HOME/.cargo/bin"

If Rust is already installed, check for a new version using the following command.

rustup update

Creating a new Rust project

To create a new Rust application, use the cargo new command. It generates a Cargo.toml file (the configuration file of the project) and a src/main.rs file.

└─$ cargo new x509_certificate_parser
     Created binary (application) `x509_certificate_parser` package

└─$ tree x509_certificate_parser 
x509_certificate_parser
├── Cargo.toml
└── src
    └── main.rs

From here you can start writing your code in main.rs. By default, the file contains a Hello World program. You can compile and run the application using cargo run.

└─$ cd x509_certificate_parser 
└─$ cargo run                 
   Compiling x509_certificate_parser v0.1.0 (/mnt/hgfs/SHARE/RUST/Rust_tool/x509_certificate_parser)
    Finished dev [unoptimized + debuginfo] target(s) in 8.27s
     Running `target/debug/x509_certificate_parser`
Hello, world!

Rust crates and dependencies

In Rust, as in most languages, you can use libraries (called crates in Rust). This lets us reuse code written by the community. Public crates are listed on crates.io, the Rust community’s crate registry.

You can add dependencies by using the cargo command.

└─$ cargo add rand
    Updating crates.io index
      Adding rand v0.8.5 to dependencies.

This adds the dependency directly to the Cargo.toml file, using the latest version of the crate.

└─$ cat Cargo.toml 
[package]
name = "x509_certificate_parser"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
rand = "0.8.5"

You can also add it manually if you want to use a specific version. Then, in your code, you can import dependencies and functions using the use statement.

use rand::random;

fn main() {
    println!("Hello, world!");
    println!("{}", random::<u32>());
}

For this example I use the random() function. Compiling and running the program gives the following result:

└─$ cargo run
Hello, world!
364563144

Rust modules

During the development of the application I will use modules. When building a larger and more complex application, modules let you split the code across multiple files, which makes the code base much easier to maintain.

This is where the lib.rs file comes in. It is the root of the library part of the package. When a user imports the library with the use statement, they access what lib.rs exposes: any module, struct, function, or other item that is reachable from lib.rs and marked as pub is available to the user of the library.

Let’s pretend you need to use a utility function several times, for example a function that prints something. Let’s write the function in a utils.rs file.

// Content of utils.rs
pub fn print_name(name: &str) {
    println!("My name is {name}.");
}
Info
You can also write code directly in the lib.rs file.

Now, to get access to this code, you have to create a lib.rs file and declare the module with the mod statement.

pub mod utils;
Info
Note that the function is marked as pub (public), allowing another source file to use it.

You should have the following directory tree:

└─$ tree .
├── Cargo.toml
├── src
│   ├── lib.rs
│   ├── main.rs
│   └── utils.rs

Finally, in main.rs, you can import the print_name() function and use it in your code.

use x509_certificate_parser::utils::print_name;

fn main() {
    println!("Hello, world!");
    print_name("Pezzz");
}

You can now compile and run.

└─$ cargo run                      
    Finished dev [unoptimized + debuginfo] target(s) in 1.75s
     Running `target/debug/x509_certificate_parser`
Hello, world!
My name is Pezzz.
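
To see what pub changes in practice, here is a minimal single-file sketch that uses an inline module instead of a separate utils.rs file. The names format_name and decorate are illustrative assumptions and are not part of the parser.

```rust
// Inline module: behaves like a utils.rs file declared with `mod utils;`.
mod utils {
    // Public: callable from outside the module.
    pub fn format_name(name: &str) -> String {
        decorate(&format!("My name is {name}."))
    }

    // Private (no `pub`): only visible inside `utils`.
    fn decorate(text: &str) -> String {
        format!("> {text}")
    }
}

fn main() {
    println!("{}", utils::format_name("Pezzz"));
    // utils::decorate("x"); // would not compile: `decorate` is private
}
```

Only format_name() can be reached from main(); the private helper stays an implementation detail of the module.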

Rust Ownership & Borrowing

Ownership is one of the main concepts that sets Rust apart from other languages. In Rust, every value has exactly one owner: the variable it is bound to. If the value is moved (for example, assigned to another variable), the original variable can no longer be used in the program.

Consider the following program.

fn main() {
    let value1 = String::from("Coucou");
    println!("{value1}");

    let _value2 = value1;
    println!("{value1}");
}

Here, we declare a variable value1 that holds a String, then assign it to another variable, _value2. If we try to use the moved value1, we get the following error:

└─$ cargo run
error[E0382]: borrow of moved value: `value1`
 --> src/main.rs:6:16
5 |     let _value2 = value1;
  |                   ------ value moved here
6 |     println!("{value1}");
  |                ^^^^^^ value borrowed here after move
Info
Note that the Rust compiler gives explicit error messages and often suggests how to fix them, which is cool :)

The variable value1 holds the address of the heap memory that contains the value Coucou. This can be explained with the diagram below.

drawing

Then the value is assigned to _value2, which becomes the new owner, and value1 can no longer reference the value on the heap.

When a value is passed as an argument to a function, it is also moved, and the original variable can no longer be used. Returning a value from a function likewise moves it to the caller. This mechanism guarantees memory safety without requiring a garbage collector or manual memory management.

For our example, one solution is to borrow the value by adding the & character.

fn main() {
    let value1 = String::from("Coucou");
    println!("{value1}");

    let _value2 = &value1;
    //____________^
    println!("{value1}");
}

Borrowing allows another variable to hold a temporary reference to a value without taking ownership. We can confirm that it works by running the program.

└─$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 1.89s
     Running `target/debug/x509-certificate-parser`
Coucou
Coucou
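
The same rules apply when a value crosses a function boundary. The sketch below contrasts a function that takes ownership with one that only borrows; consume() and inspect() are hypothetical names chosen for the illustration.

```rust
// Takes ownership: the caller's variable is moved into `text`.
fn consume(text: String) -> usize {
    text.len()
}

// Borrows: the caller keeps ownership and can reuse the value afterwards.
fn inspect(text: &str) -> usize {
    text.len()
}

fn main() {
    let value = String::from("Coucou");

    // Borrowing can be repeated as often as needed.
    println!("{}", inspect(&value));
    println!("{}", inspect(&value));

    // Moving is only possible once; `value` is unusable after this call.
    println!("{}", consume(value));
    // println!("{value}"); // would fail with error[E0382]
}
```

As a rule of thumb, a function that only needs to read a value should borrow it, which is why most helpers in this article take &str rather than String.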

Error handling in Rust

Rust’s error handling is implemented using the Result and Option types, which are both enums that can have two possible variants:

  • Ok() or Err() for Result ;
  • Some() or None for Option.

The Result type is used to represent computations that may fail. It has two variants:

  • Ok(T), which holds a value of type T representing a successful computation ;
  • Err(E), which holds a value of type E representing an error.

For example, we define a read_file function that takes a file path as input and attempts to read and print the content of the file.

use std::error::Error;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn read_file(path: &str) -> Result<(), Box<dyn Error>> {
    let file = File::open(path)?;
    let reader = BufReader::new(file);

    for line in reader.lines() {
        let line = line?;
        println!("{}", line);
    }

    Ok(())
}

Here’s how the error handling works here:

  1. The File::open method returns a Result that holds an Ok(file) variant if the file is successfully opened, or an Err(error) variant if it cannot be opened.
Tip
Note: We use the ? operator to propagate any error that may occur to the caller of the function.
  2. If the file is successfully opened, we create a BufReader object that allows us to read the file efficiently.
  3. We iterate over each line using a for loop and the reader.lines() method, which produces a Result<String, io::Error> for each line. Note again the use of ? to propagate any error that may occur.
  4. If there are no errors, we print the contents of each line to the console.
  5. Finally, we return an Ok(()) value to indicate that the function completed successfully.
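
To make the ? operator less magical, here is a rough sketch of the same small function written twice, once with ? and once with an explicit match. The function names and the file path are assumptions made for the example. Strictly speaking, ? also converts the error type with From when needed, which the long version does not show.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

// Short version: `?` returns early with the error if something fails.
fn first_line_short(path: &str) -> io::Result<String> {
    let file = File::open(path)?;
    let mut line = String::new();
    BufReader::new(file).read_line(&mut line)?;
    Ok(line)
}

// Long version: the same control flow spelled out with `match`.
fn first_line_long(path: &str) -> io::Result<String> {
    let file = match File::open(path) {
        Ok(file) => file,
        // Early return: this is what `?` does on the `Err` variant.
        Err(error) => return Err(error),
    };
    let mut line = String::new();
    match BufReader::new(file).read_line(&mut line) {
        Ok(_bytes_read) => Ok(line),
        Err(error) => Err(error),
    }
}

fn main() {
    // Hypothetical path that is assumed not to exist.
    let path = "does/not/exist.txt";
    println!("{:?}", first_line_short(path).map_err(|e| e.kind()));
    println!("{:?}", first_line_long(path).map_err(|e| e.kind()));
}
```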

Then, we can use pattern matching with match to handle the result.

fn main() {
    match read_file("file.txt") {
        Ok(()) => println!("File successfully read"),
        Err(error) => eprintln!("Error reading file: {}", error),
    }
    println!("Errors are handled !")
}

This way, the application does not panic when an error occurs:

└─$ cargo run  
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running `target/debug/x509-certificate-parser`
Error reading file: No such file or directory (os error 2)
Errors are handled !

If we had called unwrap() on the Result instead of handling it, the program would have panicked and stopped its execution.

└─$ cargo run  
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:9:47
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It works the same way with the Option type, except that we would match on the Some() and None variants.
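
As a small sketch of the Option side, the hypothetical extension() helper below returns Some(...) when a file name has an extension and None otherwise, and the caller handles both cases with match.

```rust
// Returns the text after the last '.', or `None` when there is no extension.
fn extension(file_name: &str) -> Option<&str> {
    match file_name.rsplit_once('.') {
        Some((_stem, ext)) if !ext.is_empty() => Some(ext),
        _ => None,
    }
}

fn main() {
    for name in ["entries.csvh", "README"] {
        match extension(name) {
            Some(ext) => println!("{name}: extension '{ext}'"),
            None => println!("{name}: no extension"),
        }
    }
}
```

Unlike Result, Option carries no error value: None only says that nothing is there, not why.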

Rust release mode

So far, the compilation has produced an unoptimized debug build. The Rust compiler can produce a fully optimized binary with the --release switch (for example cargo build --release or cargo run --release). This option significantly improves the performance of the application. The optimizations include:

  • dead code elimination ;
  • inlining ;
  • loop unrolling.

This can considerably increase compilation time and memory usage, but it produces an optimized version of the application.

On bigger projects, there is also the LTO (Link-Time Optimization) setting. Adding it to your Cargo.toml file can further improve performance. It allows the compiler to optimize across the entire program, rather than optimizing each compilation unit independently.

# Cargo.toml
[package]
name = "x509_certificate_parser"
version = "0.1.0"
edition = "2021"

[dependencies]
rand = "0.8.5"

[profile.release]
lto = true

We will compare compilation time and performance later in the article. Now that you know all these things, let’s start developing our x509 certificate parser !

Info
Other concepts will be explained during the parser development.

🛠️ Developing a Rust x509 Certificates Parser

During the development, I will mainly use the x509-parser crate from the Rusticata project.

Needs & limitations

Input data format

When requesting CT logs, we get 2 fields for each entry, leaf_input & extra_data:

  • leaf_input is the base64-encoded Merkle tree leaf of the entry; it embeds the logged certificate (or, for a pre-certificate entry, its TBSCertificate) ;
  • extra_data is the base64-encoded additional data of the entry; it holds the rest of the certificate chain (and the full pre-certificate for pre-certificate entries).

# Requesting Google Argon2023 logs
└─$ curl -s "https://ct.googleapis.com/logs/argon2023/ct/v1/get-entries?start=0&end=1" |jq -r '.entries[] | [.leaf_input, .extra_data] | @csv'

"AAAAAAFqSZV3lgAB43aJADBzoMZJzGVt6UbA...","AAUIMIIFBDCCAuygA..."
"AAAAAAFqSgNA/AAB43aJADBzoMZJzGVt6UbA...","AP+MIID+jCCAuKgAw..."

In addition to these fields, my collector adds the index number and the leaf hash of each entry.

Tip
The leaf hash identifies an entry in the Merkle tree of the log. It is used to prove that a certificate is really included in the log, which helps detect certificate fraud and misuse.

So our input data will have the following structure.

"index","base64_leaf_hash","leaf_input","extra_data"
"209918139","zbJbR6...","AAAAAAGD+kuKBAA...","AAqPAAUaMIIFF..."
[...]

Data is collected into multiple CSV files.

Output data format

The parser needs to extract the metadata from each certificate, and the output must be split by type: Pre-certificate | Certificate | CA certificate.

Indeed, a CT record contains a certificate chain, which consists of a pre-certificate or a certificate, followed by one or more CA (Certificate Authority) certificates.

Tip
Pre-certificates allow website owners and CAs to detect potential issues with a certificate before it is issued and publicly available.

So at the end we want 3 output files:

  • precert.scsvh: file that contains all pre-certificates ;
  • cert.scsvh: file that contains all end certificates ;
  • ca.scsvh: file that contains all CA certificates.

All records from these files will be processed by another technology, here Apache Spark - Scala, which I could present in another article.

Certificate Data Structures

The first thing we want to do is decide which fields to extract from the certificates. With the help of the x509 Certificate RFC, I chose to extract the following metadata:

  • serial
  • fingerprint_sha256
  • fingerprint_sha1
  • issuer
  • subject
    • country
    • common_name
    • locality
    • organization
    • organizational_unit
    • state_province_name
  • extensions
    • subject_key_id
    • authority_key_id
    • basic_constraints
    • alternative_name
  • not_before
  • not_after

To this I add 3 more fields:

  • index: index number of the record in the CT log ;
  • log: name of the CT log ;
  • raw: base64 encoded raw certificate.

Creating data structures

Now that we have our fields, we can start creating structures. In our lib.rs file, we create 3 structures:

  • Metadata: contains all fields ;
  • Subject: contains all subject fields ;
  • Extensions: contains all extensions.

#[derive(Debug)]
pub struct Metadata {
    pub index: String,
    pub log: String,
    pub serial: String,
    pub fingerprint_sha256: String,
    pub fingerprint_sha1: String,
    pub issuer: String,
    pub subject: Subject,
    pub extensions: Extensions,
    pub not_before: String,
    pub not_after: String,
    pub raw: String,
}

#[derive(Debug)]
pub struct Subject {
    pub subject_country: String,
    pub subject_common_name: String,
    pub subject_locality: String,
    pub subject_organization: String,
    pub subject_organizational_unit: String,
    pub subject_state_province_name: String,
}

#[derive(Debug)]
pub struct Extensions {
    pub subject_key_id: String,
    pub authority_key_id: String,
    pub basic_constraints: String,
    pub subject_alternative_name: String,
}

Implementations

Now that we have our structures, we can implement a new() method for each of them, which lets us create an instance of Metadata for each record. This can be done with the impl keyword.

impl Metadata {
    pub fn new(
        index: String,
        log: String,
        fingerprint_sha256: String,
        fingerprint_sha1: String,
        raw: String
    ) -> Self {
        Self {
            index,
            log,
            serial: String::new(),
            fingerprint_sha256,
            fingerprint_sha1,
            issuer: String::new(),
            subject: Subject::new(),
            extensions: Extensions::new(),
            not_before: String::new(),
            not_after: String::new(),
            raw,
        }
    }
}

Here the new() method takes 5 arguments, which are known before the certificate metadata is extracted. The other fields are initialized to empty String instances using String::new(). We can do the same for both our Subject & Extensions structures.

You can check that it works by creating an instance of Metadata and printing the structure.
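
For reference, a possible new() for the two remaining structures could look like the sketch below. Deriving Default is a shortcut I am assuming here: it fills every String field with an empty string, which gives the same result as writing String::new() for each field by hand.

```rust
#[derive(Debug, Default)]
pub struct Subject {
    pub subject_country: String,
    pub subject_common_name: String,
    pub subject_locality: String,
    pub subject_organization: String,
    pub subject_organizational_unit: String,
    pub subject_state_province_name: String,
}

#[derive(Debug, Default)]
pub struct Extensions {
    pub subject_key_id: String,
    pub authority_key_id: String,
    pub basic_constraints: String,
    pub subject_alternative_name: String,
}

impl Subject {
    // Every field starts as an empty String.
    pub fn new() -> Self {
        Self::default()
    }
}

impl Extensions {
    // Every field starts as an empty String.
    pub fn new() -> Self {
        Self::default()
    }
}

fn main() {
    println!("{:?}", Subject::new());
    println!("{:?}", Extensions::new());
}
```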

use x509_certificate_parser::Metadata;

fn main() {
    let metadata = Metadata::new(
        "index".to_owned(), 
        "log".to_owned(), 
        "fingerprint_sha256".to_owned(), 
        "fingerprint_sha1".to_owned(), 
        "raw".to_owned()
    );
    println!("{metadata:#?}");
}

This should provide the following result.

└─$ cargo run  
    Finished dev [unoptimized + debuginfo] target(s) in 1.92s
     Running `target/debug/x509-certificate-parser`

Metadata {
    index: "index",
    log: "log",
    serial: "",
    fingerprint_sha256: "fingerprint_sha256",
    fingerprint_sha1: "fingerprint_sha1",
    issuer: "",
    subject: Subject {
        subject_country: "",
        subject_common_name: "",
        subject_locality: "",
        subject_organization: "",
        subject_organizational_unit: "",
        subject_state_province_name: "",
    },
    extensions: Extensions {
        subject_key_id: "",
        authority_key_id: "",
        basic_constraints: "",
        subject_alternative_name: "",
    },
    not_before: "",
    not_after: "",
    raw: "raw",
}
Tip

Note the use of #[derive(Debug)], a Rust attribute that automatically generates an implementation of the Debug trait for a given struct or enum. This allows us to print the Metadata structure.

In {metadata:#?}, :? selects the Debug formatter and # is the pretty-print flag.
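
The difference between the two Debug formats is easier to see on a small structure. Validity and sample() are made-up names for this sketch and are not used by the parser.

```rust
#[derive(Debug)]
struct Validity {
    not_before: String,
    not_after: String,
}

// Builds a value with arbitrary example dates.
fn sample() -> Validity {
    Validity {
        not_before: "2023-01-01".to_owned(),
        not_after: "2023-04-01".to_owned(),
    }
}

fn main() {
    let validity = sample();
    println!("{} -> {}", validity.not_before, validity.not_after);
    // Compact Debug output, on a single line.
    println!("{validity:?}");
    // Pretty Debug output, one field per line.
    println!("{validity:#?}");
}
```

The compact form is handy in log lines, while the pretty form is easier to read for nested structures such as Metadata.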

Reading data from csv files

Now that we have our structures, we need a way to read all the records from multiple CSV files. To read CSV files we can use the csv crate.

But first, as we have multiple CSV files, we need to read the content of the directory that contains all files. To do so, we can use the read_dir() function from the Rust standard library.

use std::error::Error;
use std::fs::{read_dir, DirEntry};
use std::path::Path;

fn prepare_parsing(dir_path: &str) -> Result<(), Box<dyn Error>> {
    // check if path is a directory
    if Path::new(dir_path).is_dir() {
        let entries = read_dir(dir_path)?;

        // collect all files into a Vec
        let entries_path: Vec<DirEntry> = entries
            .into_iter()
            .filter(|f| f.is_ok())
            .map(|f| f.unwrap())
            .collect();

        // iterate over all files path and print these
        for entry_path in entries_path {
            println!("{entry_path:?}");
        }
        Ok(())
    }
    else {
        Err(std::io::Error::new(
            std::io::ErrorKind::NotFound,
            format!("Directory '{}' does not exist", dir_path)
        ).into())
    }
}

fn main() {
    let dir_path = "assets/test_dir";

    if let Err(e) = prepare_parsing(dir_path) {
        println!("ERROR: {}",e);
    }
}

Here, we define a prepare_parsing() function that takes the directory path of our files as an argument. First, we check that the given path is a directory with the is_dir() method; if it is not, we return an error.

Then, by using the read_dir() function, we get an iterator over all directory entries. We collect all entries into a Vector.

Info
By using filter() & map() we only keep successful entries.

└─$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 2.71s
     Running `target/debug/x509-certificate-parser`
DirEntry("assets/test_dir/entries_209915904-209919999.csvh")
DirEntry("assets/test_dir/entries_209920000-209924095.csvh")
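
As a side note, the filter() and map() pair used in prepare_parsing() can be merged into a single filter_map(Result::ok) call. The sketch below shows the idea on parsed numbers instead of directory entries so that it runs without any file on disk; keep_ok() is a hypothetical helper name.

```rust
// Keeps only the successful results, silently dropping the errors.
fn keep_ok<T, E>(results: Vec<Result<T, E>>) -> Vec<T> {
    results.into_iter().filter_map(Result::ok).collect()
}

fn main() {
    // Stand-in for directory entries: parsing "abc" fails, the others succeed.
    let parsed: Vec<Result<u32, std::num::ParseIntError>> =
        ["209915904", "abc", "209920000"]
            .iter()
            .map(|s| s.parse::<u32>())
            .collect();

    let indexes = keep_ok(parsed);
    println!("{indexes:?}");
}
```

Both forms ignore failed entries without reporting them, which is acceptable here but worth keeping in mind if a missing file must be noticed.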

We thus obtain a vector with all the files. Then, we create a read_entry() function that reads all the records from a file.

use csv::ReaderBuilder;

fn read_entry(entry: &DirEntry) -> Result<(), Box<dyn Error>> {
    let path = entry.path();

    // Build a CSV Reader
    let mut rdr = ReaderBuilder::new()
        .has_headers(true)
        .from_path(path)?;

    // iterate over all record
    for record in rdr.records() {
        let record = record?;
        println!("First field {} Second field {}", &record[0], &record[1]);
    }
    Ok(())
}

We create a ReaderBuilder, which lets us configure the reader; here we set the has_headers option to true because each file has a header. Then, we iterate over the records and print the first two fields of each one.

└─$ cargo run
First field 209918139 Second field zbJbR6+FtCth2iBy+kAE7SD+GrkEA0jUpZPRtynRINM=
First field 209918140 Second field ziMPj+QLYo+GjGRDkLt6IIxbHmi7pM3OI8WKboYNNEk=

Parsing a x509 certificate chain

Now that we can read records, we can start by parsing the x509 chain of each record.

To do so, I create the function get_x509_chain(), which takes two arguments: the name of the log and a StringRecord, which corresponds to a CSV line.

fn get_x509_chain(log: String, record: StringRecord) -> Result<(), Box<dyn Error>> {
    // parse CSV fields    
    let index: &str = &record[0];
    let base64_leaf_hash: &str = &record[1];
    let leaf_input: &str = &record[2];
    let extra_data: &str = &record[3];
    
    let leaf: Leaf = get_leaf(leaf_input, extra_data)?;
    Ok(())
}

The function reads all the fields and passes leaf_input and extra_data to the get_leaf() function.

Tip
A leaf certificate refers to the end-entity certificate that is issued to a specific domain or organization.

fn get_leaf(leaf_input: &str, extra_data: &str) -> Result<Leaf, Box<dyn Error>> {
    // get Leaf struct
    let leaf_input = general_purpose::STANDARD.decode(leaf_input)?;
    let extra_data = general_purpose::STANDARD.decode(extra_data)?;
    let leaf: Leaf = Leaf::from_raw(
        leaf_input.as_slice(), 
        extra_data.as_slice()
    )?;
    Ok(leaf)
}

Here, I use the ctclient crate from Tingmao Wang, which contains the Leaf struct with a method that parses a leaf from leaf_input and extra_data.

But when we try to run the program, we get the following error.

└─$ cargo run

error[E0277]: the trait bound `ctclient::Error: std::error::Error` is not satisfied
  --> src/main.rs:20:6
   |
20 |     )?;
   |      ^ the trait `std::error::Error` is not implemented for `ctclient::Error`

We can’t propagate the error from the Leaf::from_raw() method because it uses ctclient::Error, which does not implement std::error::Error. To solve this problem, we will implement our own error handling.

Create your own error handling system

To create our own error handling, we first define a new enum. I will create it in the lib.rs file.

I will take as an example the first error that we handled. Do you remember the first check to see if the given path is a directory? Let’s implement our own error for that first.

We define a new ParserErrors enum. It contains an IsNotDir variant, which carries a String.

// from lib.rs file

#[derive(Debug)]
pub enum ParserErrors {
    // path is not a dir
    IsNotDir(String),
}

At the top of our main.rs file, we declare a new type alias, ParserResult.

Then, we change the return type of the function to ParserResult<()>. In the else branch, we can then return our custom error, providing the directory path that was tested.

// from main.rs file

type ParserResult<T> = Result<T, ParserErrors>;

fn prepare_parsing(dir_path: &str) -> ParserResult<()> {
    // check if path is a directory
    if Path::new(dir_path).is_dir() {
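        // Note: this `?` only compiles once the io::Error is converted
        // into a ParserErrors value, which is done with map_err() further down
        let entries = read_dir(dir_path)?;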
        [...]
        Ok(())
    }
    else {
        Err(ParserErrors::IsNotDir(dir_path.to_owned()))
    }
}

At this point the error handling is implemented for that error, but we can’t actually print the error because our enum does not implement the Display trait. To fix that, we can write the following code.

// from lib.rs file

impl std::fmt::Display for ParserErrors {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        use ParserErrors::*;
        match self {
            IsNotDir(path) => {
                write!(f, "Path '{}' is not a directory !\n", path)
            }
        }
    }
}

Now you can easily create a new type of error by adding a variant to the enum (and a matching arm in the Display implementation). Then, in your code, if you want to propagate an error with the ? operator, you can convert it first with the map_err() method.

// from main.rs file

fn prepare_parsing(dir_path: &str) -> ParserResult<()> {
    // check if path is a directory
    if Path::new(dir_path).is_dir() {
        let entries = read_dir(dir_path)
            .map_err(|_e| ParserErrors::DirNotFound(format!("{dir_path}")))?;
        [...]
}
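
Putting the pieces together, a self-contained sketch of this error handling could look like the code below. The DirNotFound variant appears in the snippet above but its definition is not shown in the article, so its message here is my assumption, as is the count_entries() helper used to exercise the error paths.

```rust
use std::fmt;
use std::fs::read_dir;
use std::path::Path;

#[derive(Debug)]
pub enum ParserErrors {
    // path is not a dir
    IsNotDir(String),
    // directory cannot be read (assumed definition)
    DirNotFound(String),
}

impl fmt::Display for ParserErrors {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParserErrors::IsNotDir(path) => {
                write!(f, "Path '{path}' is not a directory !")
            }
            ParserErrors::DirNotFound(path) => {
                write!(f, "Directory '{path}' cannot be read !")
            }
        }
    }
}

// With Debug and Display in place, the standard Error trait needs no extra code.
impl std::error::Error for ParserErrors {}

type ParserResult<T> = Result<T, ParserErrors>;

// Hypothetical helper: counts the readable entries of a directory.
fn count_entries(dir_path: &str) -> ParserResult<usize> {
    if !Path::new(dir_path).is_dir() {
        return Err(ParserErrors::IsNotDir(dir_path.to_owned()));
    }
    // Convert the io::Error into our own error type before using `?`.
    let entries = read_dir(dir_path)
        .map_err(|_e| ParserErrors::DirNotFound(dir_path.to_owned()))?;
    Ok(entries.filter_map(Result::ok).count())
}

fn main() {
    match count_entries("assets/test_dir") {
        Ok(count) => println!("{count} entries"),
        Err(e) => eprintln!("ERROR: {e}"),
    }
}
```

Implementing std::error::Error is optional for this article, but it lets ParserErrors be boxed as Box<dyn Error> like the errors used earlier.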

Going back on our certificate chain parsing

Now that errors can be handled, we can parse the leaf using Leaf::from_raw() and propagate the error.

// from main.rs file

fn get_leaf(leaf_input: &str, extra_data: &str) -> ParserResult<Leaf> {
    // base64 decode fields
    let leaf_input = general_purpose::STANDARD.decode(leaf_input)
        .map_err(|e| ParserErrors::DecodeFail(format!("{e}")))?;
    let extra_data = general_purpose::STANDARD.decode(extra_data)
        .map_err(|e| ParserErrors::DecodeFail(format!("{e}")))?;
    
    // get Leaf struct
    let leaf: Leaf = Leaf::from_raw(
        leaf_input.as_slice(), 
        extra_data.as_slice()
    ).map_err(|e| ParserErrors::LeafParseFail(format!("{e}")))?;
    Ok(leaf)
}

fn get_x509_chain(log: String, record: StringRecord) -> ParserResult<()> {
    // parse CSV fields    
    let index: &str = &record[0];
    let base64_leaf_hash: &str = &record[1];
    let leaf_input: &str = &record[2];
    let extra_data: &str = &record[3];
    
    // get_leaf() already returns a ParserErrors value, so `?` is enough here
    // and the original variant (DecodeFail or LeafParseFail) is preserved
    let leaf: Leaf = get_leaf(leaf_input, extra_data)?;

    println!("{leaf:?}");
    Ok(())
}

Running this prints each leaf.

└─$ cargo run

Leaf(cdb25b47af85b42b61da2072fa4004ed20fe1ab9040348d4a593d1b729d120d3)
Leaf(ce230f8fe40b628f868c644390bb7a208c5b1e68bba4cdce23c58a6e860d3449)
Leaf(848a43e264490a451a845be7bc0688f61eef992b4dbb8de46f1443ecab467ade)
Leaf(84e6fd91c63df09c998ede1355bd7a6d0d2fbdbb64962d9dafbdca245c83654d) (pre_cert)

The Leaf struct defined on the ctclient crate is composed of multiple elements.

pub struct Leaf {
    pub hash: [u8; 32],
    pub timestamp: u64,
    pub is_pre_cert: bool,
    pub x509_chain: Vec<Vec<u8>>,
    pub tbs_cert: Option<Vec<u8>>,
    pub issuer_key_hash: Option<Vec<u8>>,
    pub extensions: Vec<u8>,
}

In that struct, we will use both x509_chain & is_pre_cert:

  • x509_chain: contains the certificate chain ;
  • is_pre_cert: a boolean that tells whether the entry is a pre-certificate.

The x509_chain is a vector so we can iterate over certificates from the chain.

The library specifies that the first certificate of the chain is the end-entity certificate (or the pre-certificate, if is_pre_cert is true), and the last is the root CA. Knowing that, we can separate each type of certificate (Cert/Precert/CA), as wanted. But first we will extract the metadata from each certificate.
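
The positional rule described above can be sketched with a small helper. CertKind and classify() are hypothetical names introduced for the illustration, and the chain is filled with dummy bytes instead of real DER certificates.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum CertKind {
    PreCert,
    Cert,
    Ca,
}

// First element: pre-certificate or end-entity certificate.
// Every following element: a CA certificate.
fn classify(position: usize, is_pre_cert: bool) -> CertKind {
    match (position, is_pre_cert) {
        (0, true) => CertKind::PreCert,
        (0, false) => CertKind::Cert,
        _ => CertKind::Ca,
    }
}

fn main() {
    // Stand-in for `leaf.x509_chain`: three certificates as dummy bytes.
    let chain: Vec<Vec<u8>> = vec![vec![0x30], vec![0x30], vec![0x30]];
    let is_pre_cert = true;

    for (position, cert) in chain.iter().enumerate() {
        let kind = classify(position, is_pre_cert);
        println!("{position}: {kind:?} ({} bytes)", cert.len());
    }
}
```

Each kind can then be routed to its own output file (precert, cert or ca).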

Parsing certificate metadata

Fingerprinting certificates

As our struct expects SHA256 & SHA1 fingerprints, we first need to get the raw certificate as a slice of bytes. To do so, we can use the as_slice() method.

for cert in leaf.x509_chain {
    // convert the Vec<u8> into a slice of u8 (&[u8]),
    // which lets us compute the fingerprints
    let cert = cert.as_slice();
}

Here we iterate over the x509 chain and convert each certificate to a slice. Then, we can create the SHA hasher objects like so.

// assuming the sha2 crate: use sha2::{Digest, Sha256};
// create a Sha256 object
let mut sha256_hasher = Sha256::new();
sha256_hasher.update(cert);
let sha256 = sha256_hasher.finalize();

let fingerprint_sha256 = format!("{:X}", sha256);

This produces an array of bytes that you can format with the {:X} formatter, which yields an uppercase hexadecimal string.

For better readability, I wrap both SHA computations in functions that I put in a utils.rs file.
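
The digest type returned by finalize() supports {:X} directly, but a plain byte slice does not. As a sketch of what the formatting amounts to, the hypothetical to_upper_hex() helper below builds the same kind of string with the standard library only.

```rust
use std::fmt::Write;

// Formats each byte as two uppercase hexadecimal digits.
fn to_upper_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for byte in bytes {
        // `{:02X}` pads with a leading zero so 0x0f becomes "0F", not "F".
        write!(out, "{byte:02X}").expect("writing to a String cannot fail");
    }
    out
}

fn main() {
    // First four bytes of an arbitrary digest, for illustration.
    let digest_prefix = [0xcd, 0xb2, 0x5b, 0x47];
    println!("{}", to_upper_hex(&digest_prefix));
}
```

The zero padding matters: without it, fingerprints of the same certificate could end up with different lengths.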

Encoding raw certificate

As we want to save the raw certificate, we can encode it in base64 using the base64 crate.

// get base64 encoded raw cert
let raw = general_purpose::STANDARD.encode(cert);

Creating instance of Metadata

Now we can create an instance of Metadata, the structure that we created earlier, as we have all the input arguments.

// Instantiate Metadata
let metadata = Metadata::new(
    index.to_string(),
    log,
    fingerprint_sha256,
    fingerprint_sha1,
    raw
);

Parsing Metadatas from certificates

At this point, we have the following code.

fn parse_x509_chain(log: &str, index: &str, leaf: Leaf) -> ParserResult<()> {
    let is_precert = leaf.is_pre_cert;
    let mut idx_chain = 0;

    // iterate over the certification chain
    for cert in leaf.x509_chain {
        // convert the Vec<u8> into a slice of u8 (&[u8]),
        // which lets us compute the fingerprints
        let cert = cert.as_slice();
        
        // get fingerprints
        let fingerprint_sha256: String = get_sha256(cert);
        let fingerprint_sha1: String = get_sha1(cert);
        // get base64 encoded raw cert
        let raw = general_purpose::STANDARD.encode(cert);
        
        // Instantiate Metadata structure
        let init_metadata = Metadata::new(
            index.to_string(),
            log.to_string(),
            fingerprint_sha256,
            fingerprint_sha1,
            raw
        );
    }
    Ok(())
}

We can now start parsing the x509 certificates. To do so, I will use the parse_x509_certificate function, which takes a slice of bytes as input and returns a Result whose Ok value holds an X509Certificate struct. That struct contains a TbsCertificate object with all the metadata of a certificate, as defined in RFC 5280.

TBSCertificate  ::=  SEQUENCE  {
    version         [0]  EXPLICIT Version DEFAULT v1,
    serialNumber         CertificateSerialNumber,
    signature            AlgorithmIdentifier,
    issuer               Name,
    validity             Validity,
    subject              Name,
    subjectPublicKeyInfo SubjectPublicKeyInfo,
    issuerUniqueID  [1]  IMPLICIT UniqueIdentifier OPTIONAL,
                         -- If present, version MUST be v2 or v3
    subjectUniqueID [2]  IMPLICIT UniqueIdentifier OPTIONAL,
                         -- If present, version MUST be v2 or v3
    extensions      [3]  EXPLICIT Extensions OPTIONAL
                         -- If present, version MUST be v3
}

The parse_x509_certificate function returns a Result, so we need to handle both cases and continue only with the Ok value, like so:

// Metadata parsing
let res = parse_x509_certificate(cert);
match res {
    Ok((_rem, certificate)) => {
        let metadata: Metadata = parse_metadata(init_metadata, certificate)
            .map_err(|e| ParserErrors::MetadataParseFail(format!("{e}")))?;
    },
    // return early with a parser error if the certificate cannot be parsed
    Err(err) => return Err(ParserErrors::CertParseFail(format!("{err}"))),
}
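
The two error-handling styles used above (an explicit match, and map_err combined with the ? operator) can be compared on a smaller, self-contained example. DemoError and both functions below are hypothetical stand-ins for ParserErrors and the parsing code; they only illustrate the pattern.

```rust
/// Hypothetical error type standing in for the article's `ParserErrors`.
#[derive(Debug, PartialEq)]
pub enum DemoError {
    ParseFail(String),
    ZeroPort,
}

/// Same idea as `ParserResult<T>`: a `Result` with a fixed error type.
pub type DemoResult<T> = Result<T, DemoError>;

/// Style 1: explicit `match`, handling `Ok` and `Err` separately.
pub fn parse_port_match(input: &str) -> DemoResult<u16> {
    match input.trim().parse::<u16>() {
        Ok(port) => Ok(port),
        Err(err) => Err(DemoError::ParseFail(format!("{err}"))),
    }
}

/// Style 2: `map_err` converts the error type, then `?` returns early on `Err`
/// and unwraps the value on `Ok`.
pub fn parse_nonzero_port(input: &str) -> DemoResult<u16> {
    let port = input
        .trim()
        .parse::<u16>()
        .map_err(|e| DemoError::ParseFail(format!("{e}")))?;
    if port == 0 {
        return Err(DemoError::ZeroPort);
    }
    Ok(port)
}
```

The second style keeps the happy path flat, which is why it is convenient when several fallible calls follow each other, as in parse_metadata().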

Then, we create the parse_metadata() function, which takes as input our Metadata struct and the X509Certificate produced by parse_x509_certificate.

In this function, we define two variables, subject_meta and x509_ext, which hold the results of two new functions:

  • get_subject_meta(): parses the subject fields and returns a Subject struct ;
  • get_extensions(): parses the extensions and returns an Extensions struct.

fn parse_metadata(init_metadata: Metadata, certificate: X509Certificate) -> ParserResult<Metadata> {
    // get Certificate Subject Metadatas
    let subject_meta: Subject = get_subject_meta(certificate.subject())
        .map_err(|e| ParserErrors::MetadataParseFail(format!("{e}")))?;

    // get Certificate Extensions
    let x509_ext: Extensions = get_extensions(&certificate.tbs_certificate)
        .map_err(|e| ParserErrors::MetadataParseFail(format!("{e}")))?;

    let not_before = format_date(certificate.validity().not_before.timestamp())
        .map_err(|e| ParserErrors::MetadataParseFail(format!("{e}")))?;
    let not_after = format_date(certificate.validity().not_after.timestamp())
        .map_err(|e| ParserErrors::MetadataParseFail(format!("{e}")))?;
    
    let metadata = init_metadata;
    Ok(metadata)
}

Parsing Subject metadata

The function takes an X509Name object as input. This object has an iter_attributes() method that lets us iterate over each attribute. We check the type (OID) of each attribute and copy its value into the corresponding field of the Subject structure. We obtain the following function:

fn get_subject_meta(subject: &X509Name) -> ParserResult<Subject> {
    let mut subject_meta = Subject::new();
    subject.iter_attributes().for_each(|attr| {
        if attr.attr_type() == &OID_X509_COUNTRY_NAME {
            subject_meta.subject_country = attr.as_str().unwrap_or("").to_string();
        }
        else if attr.attr_type() == &OID_X509_COMMON_NAME {
            subject_meta.subject_common_name = attr.as_str().unwrap_or("").to_string();
        } 
        [...]
    });
    Ok(subject_meta)
}

Here, I used unwrap_or(), which returns the contained value when as_str() succeeds, or a default value otherwise (here, an empty string).
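
As a small standalone illustration, unwrap_or() exists on both Result and Option and behaves the same way on each: keep the value if there is one, otherwise fall back to the default. The two helper functions below are hypothetical and use only the standard library.

```rust
/// Returns the UTF-8 text of `bytes`, or an empty string when decoding fails.
pub fn text_or_empty(bytes: &[u8]) -> String {
    // `from_utf8` returns a `Result`; `unwrap_or` substitutes the default on `Err`.
    std::str::from_utf8(bytes).unwrap_or("").to_string()
}

/// Returns the first item of `items`, or "unknown" when the slice is empty.
pub fn first_or_unknown(items: &[&str]) -> String {
    // `first` returns an `Option`; `unwrap_or` substitutes the default on `None`.
    items.first().copied().unwrap_or("unknown").to_string()
}
```

The trade-off is that the error is silently discarded, which is acceptable here because an empty subject field is a valid result.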

Finally, the Subject struct is returned.

Parsing Extensions metadata

The get_extensions() function takes the TBS certificate as input. It has an extensions() method that can be used to iterate over each extension using into_iter().

fn get_extensions(tbs_certificate: &TbsCertificate) -> ParserResult<Extensions> {
    let mut extensions = Extensions::new();
    tbs_certificate.extensions().into_iter().for_each(|ext| match &ext.parsed_extension() {
        ParsedExtension::SubjectKeyIdentifier(ski) => {
            // format key into hex
            let format_ski = ski.0.iter()
                .map(|b| format!("{:02X}", b))
                .collect::<Vec<String>>()
                .join(":");

            extensions.subject_key_id = format_ski;
        },
        ParsedExtension::AuthorityKeyIdentifier(aki) => {
            if let Some(aki) = &aki.key_identifier {
                // format key into hex
                let format_aki = aki.0.iter()
                    .map(|b| format!("{:02X}", b))
                    .collect::<Vec<String>>()
                    .join(":");
                
                extensions.authority_key_id = format_aki;
            }
        },
        ParsedExtension::BasicConstraints(bc) => {
            extensions.basic_constraints = format!("CA:{}", bc.ca);
        }
        ParsedExtension::SubjectAlternativeName(san) => {
            let alt_name = get_alt_name(&san.general_names).unwrap_or(String::new());
            extensions.subject_alternative_name = alt_name;
        }
        _ => (),
    });
    Ok(extensions)
}

The Authority and Subject Key Identifiers are formatted as hexadecimal. By default, the structure exposes both as byte slices ([u8]). I used .map(|b| format!("{:02X}", b)) to format each byte as two upper-case hexadecimal digits, then joined the results with colons.
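
Since the same formatting is applied to both identifiers, it could be moved into a small helper. This is only a sketch; format_key_id is a hypothetical name that does not appear in the original code.

```rust
/// Formats bytes as colon-separated, upper-case hexadecimal pairs.
/// Hypothetical helper, shown to isolate the formatting step.
pub fn format_key_id(bytes: &[u8]) -> String {
    bytes
        .iter()
        // `{:02X}`: upper-case hex, zero-padded to two digits
        .map(|b| format!("{:02X}", b))
        .collect::<Vec<String>>()
        .join(":")
}
```

The zero padding matters: without it, a byte such as 0x0A would be printed as "A" and the output would no longer have a fixed width per byte.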

Basic Constraints are simply formatted into a String.

Then, the alternative names are parsed using the get_alt_name() function, which takes a vector of GeneralName as input. GeneralName is defined as follows.

pub enum GeneralName<'a> {
    OtherName(Oid<'a>, &'a [u8]),
    RFC822Name(&'a str),
    DNSName(&'a str),
    X400Address(UnparsedObject<'a>),
    DirectoryName(X509Name<'a>),
    EDIPartyName(UnparsedObject<'a>),
    URI(&'a str),
    IPAddress(&'a [u8]),
    RegisteredID(Oid<'a>),
}

I chose to keep only DNSName, IPAddress and RFC822Name (email addresses). This gives the following code:

fn get_alt_name(names: &Vec<GeneralName>) -> ParserResult<String> {
    let mut alt_name: Vec<String> = Vec::new();

    names.into_iter().for_each(|name| match name {
        // get email
        GeneralName::RFC822Name(mail) => {
            alt_name.push(mail.to_string());
        },
        // get DNS name
        GeneralName::DNSName(name) => {
            alt_name.push(name.to_string());
        },
        // get ip address
        GeneralName::IPAddress(ip) => {
            let ip: String = format_ip(ip).unwrap_or(String::new());
            alt_name.push(ip);
        },
        _ => (),
    });
    Ok(alt_name.join(","))
}

We declare a new vector, alt_name. Then, we iterate over each GeneralName and push the values into the vector. If the value is an IP address, it first goes through a small format_ip() function that I wrote to format it.

// function used to format an IP address
pub fn format_ip(ip: &[u8]) -> Result<String, Box<dyn std::error::Error>> {
    match ip.len() {
        4 => Ok(IpAddr::from(<[u8; 4]>::try_from(ip)?).to_string()),
        16 => Ok(Ipv6Addr::from(<[u8; 16]>::try_from(ip)?).to_string()),
        _ => Err(Box::new(io::Error::new(
            io::ErrorKind::InvalidData, 
            "Cannot format ip address",
        ))),
    }
}

As the library gives IP addresses as a slice of bytes ([u8]), we can format them using IpAddr and Ipv6Addr from the standard library. If the slice has a length of 4, it is an IPv4 address, so we use IpAddr (which builds an IPv4 address from a 4-byte array); if the length is 16, it is an IPv6 address, so we use Ipv6Addr. Any other length is reported as an error.
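
To show what kind of output to expect, here is a standalone variant of the same idea. It is a sketch, not the article's function: ip_to_string is a hypothetical name, and it returns an Option instead of a boxed error to keep the example short.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Hypothetical variant of `format_ip`: 4 bytes are read as IPv4,
/// 16 bytes as IPv6, and any other length yields `None`.
pub fn ip_to_string(ip: &[u8]) -> Option<String> {
    match ip.len() {
        4 => Some(Ipv4Addr::new(ip[0], ip[1], ip[2], ip[3]).to_string()),
        16 => {
            // Copy the slice into a fixed-size array accepted by `Ipv6Addr::from`.
            let mut octets = [0u8; 16];
            octets.copy_from_slice(ip);
            Some(Ipv6Addr::from(octets).to_string())
        }
        _ => None,
    }
}
```

For example, the bytes [192, 168, 1, 10] are rendered as 192.168.1.10, and the 16-byte IPv6 loopback address is rendered in its compressed form, ::1.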

I put the function in the utils.rs file.

Complete our Metadata structure

Now that we have parsed all the metadata, we can store it in the Metadata structure defined earlier.

let metadata = Metadata {
    serial: certificate.tbs_certificate.raw_serial_as_string(),
    issuer: format!("{}", certificate.tbs_certificate.issuer),
    subject: subject_meta,
    extensions: x509_ext,
    not_before: not_before,
    not_after: not_after,
    ..init_metadata
};
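
The ..init_metadata at the end of the struct literal is Rust's struct update syntax: every field that is not listed explicitly is taken from init_metadata. A minimal sketch on a smaller, hypothetical struct (Record and complete are illustration names only):

```rust
/// Hypothetical, reduced version of a metadata record.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Record {
    pub index: String,
    pub log: String,
    pub serial: String,
    pub issuer: String,
}

/// Fills in the fields known only after parsing and keeps the rest of `init`.
pub fn complete(init: Record, serial: &str, issuer: &str) -> Record {
    Record {
        serial: serial.to_string(),
        issuer: issuer.to_string(),
        // Struct update syntax: the remaining fields (`index`, `log`)
        // are moved out of `init`.
        ..init
    }
}
```

Because the remaining String fields are moved, init can no longer be used as a whole afterwards, which is fine here since it is only a temporary value.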

I created another utility function, format_date(). It takes an i64 (a Unix timestamp) as input and formats it as %Y-%m-%dT%H:%M:%SZ.

// function used to format a timestamp as YYYY-MM-DDTHH:MM:SSZ
pub fn format_date(timestamp: i64) -> Result<String, Box<dyn std::error::Error>> {
    let naive_datetime = NaiveDateTime::from_timestamp_opt(timestamp, 0).ok_or("Failed to format timestamp.")?;
    let datetime = DateTime::<Utc>::from_utc(naive_datetime, Utc);
    // no fractional seconds, to match the %Y-%m-%dT%H:%M:%SZ format
    Ok(datetime.format("%Y-%m-%dT%H:%M:%SZ").to_string())
}
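
To check the target format without any dependency, the same conversion can be sketched with integer arithmetic only. This is not how the parser does it (it relies on chrono); format_timestamp is a hypothetical function based on the well-known "civil from days" calendar algorithm.

```rust
/// Hypothetical, dependency-free sketch: converts a Unix timestamp (seconds
/// since 1970-01-01T00:00:00Z) into `YYYY-MM-DDTHH:MM:SSZ`.
pub fn format_timestamp(timestamp: i64) -> String {
    // Split into whole days and seconds within the day (works for negatives too).
    let days = timestamp.div_euclid(86_400);
    let secs = timestamp.rem_euclid(86_400);

    // Civil-from-days conversion on the proleptic Gregorian calendar.
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // number of 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, in [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the next civil year.
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };

    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        year,
        month,
        day,
        secs / 3_600,
        (secs % 3_600) / 60,
        secs % 60
    )
}
```

With this sketch, the timestamp 1666347997 gives 2022-10-21T10:26:37Z, the same shape as the not_before value in the sample output.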

Parsing is now complete, and you can print each certificate. It should give the following result:

Metadata {
    index: "209918139",
    log: "GOOGLE_ARGON2023",
    serial: "03:b6:0d:5d:35:80:71:f6:93:2f:87:21:e8:93:8e:15:4d:7e",
    fingerprint_sha256: "3752A7F3358BCBC5401CD7FD21419558ECA5F3D46516794FFEB0EA5E5641DD37",
    fingerprint_sha1: "86405E3C8BDBFD5B9A77CC10D2DD6CCEEF8CA038",
    issuer: "C=US, O=Let's Encrypt, CN=R3",
    subject: Subject {
        subject_country: "",
        subject_common_name: "neogym-shop.com",
        subject_locality: "",
        subject_organization: "",
        subject_organizational_unit: "",
        subject_state_province_name: "",
    },
    extensions: Extensions {
        subject_key_id: "B9:D3:2E:A7:9A:24:26:34:C5:00:D0:10:B9:10:45:AB:BF:30:D8:D9",
        authority_key_id: "14:2E:B3:17:B7:58:56:CB:AE:50:09:40:E6:1F:AF:9D:8B:14:C2:C6",
        basic_constraints: "CA:false",
        subject_alternative_name: "neogym-shop.com",
    },
    not_before: "2022-10-21T10:26:37Z",
    not_after: "2023-01-19T10:26:36Z",
    raw: "MIIFJTCCBA2gAwIBAgISA7.....",
}

Writing results to file

Creating output files

Before writing the results, we need to create three output files, one for each certificate type. I created the following function.

// function used to create a file for each object (precert, cert, ca)
fn create_result_file(output_path: &str) -> ParserResult<()> {
    let objects: Vec<&str> = vec!["precert", "cert", "ca"];

    for object in objects.iter() {
        // Initialise paths
        let opath = format!("{output_path}/{object}.scsvh");

        // Create output files
        let _f = File::create(opath)
            .map_err(|e| ParserErrors::CreateFileFail(format!("{output_path}"), format!("{e}")))?;
    }
    Ok(())
}

This function simply uses the standard library to create an empty file for each type of certificate.

Creating an iterator over Metadata

To write each element of our Metadata struct, we need to implement the IntoIterator trait for it. This will allow us to iterate over each field of the struct. To do so, in lib.rs, I wrote the following implementation.

pub struct MetadataIter<'a> {
    metadata: &'a Metadata,
    index: usize,
}

impl<'a> Iterator for MetadataIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        let fields = [
            &self.metadata.index,
            &self.metadata.log,
            &self.metadata.serial,
            &self.metadata.fingerprint_sha256,
            &self.metadata.fingerprint_sha1,
            &self.metadata.issuer,
            &self.metadata.subject.subject_country,
            &self.metadata.subject.subject_common_name,
            &self.metadata.subject.subject_locality,
            &self.metadata.subject.subject_organization,
            &self.metadata.subject.subject_organizational_unit,
            &self.metadata.subject.subject_state_province_name,
            &self.metadata.extensions.subject_key_id,
            &self.metadata.extensions.authority_key_id,
            &self.metadata.extensions.basic_constraints,
            &self.metadata.extensions.subject_alternative_name,
            &self.metadata.not_before,
            &self.metadata.not_after,
            &self.metadata.raw,
        ];
        if self.index >= fields.len() {
            None
        } else {
            let field = fields[self.index];
            self.index += 1;
            Some(field)
        }
    }
}

impl<'a> IntoIterator for &'a Metadata {
    type Item = &'a str;
    type IntoIter = MetadataIter<'a>;

    fn into_iter(self) -> Self::IntoIter {
        MetadataIter {
            metadata: self,
            index: 0,
        }
    }
}

First, we create a MetadataIter struct, which is returned by the into_iter() method. Then, we implement the Iterator trait for this new struct.

Next, we define the next() method required by the Iterator trait. In its definition, we list all our fields; you can choose any order, but it determines the column order of the output file.

Finally, we implement the IntoIterator trait for &Metadata, which returns a MetadataIter. This lets us iterate over the fields of our three structs (Metadata, Subject and Extensions) in a single pass.
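
For comparison, here is a shorter way to get the same behavior on a small, hypothetical struct: instead of writing a dedicated iterator type, into_iter() collects the field references into a Vec once and returns the Vec's own iterator. Person and join_fields are illustration names, not part of the parser.

```rust
/// Hypothetical struct with a few String fields.
pub struct Person {
    pub name: String,
    pub city: String,
    pub email: String,
}

impl<'a> IntoIterator for &'a Person {
    type Item = &'a str;
    type IntoIter = std::vec::IntoIter<&'a str>;

    fn into_iter(self) -> Self::IntoIter {
        // The order of this list is the order in which fields are yielded.
        vec![
            self.name.as_str(),
            self.city.as_str(),
            self.email.as_str(),
        ]
        .into_iter()
    }
}

/// Joins all fields with a semicolon, using the iterator defined above.
pub fn join_fields(person: &Person) -> String {
    person.into_iter().collect::<Vec<&str>>().join(";")
}
```

This variant allocates one small Vec per call, whereas the MetadataIter approach avoids the allocation but rebuilds its array of fields on every call to next(). For a CSV writer dominated by I/O, either choice is likely fine.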

Writing metadata

Then, in the parse_x509_chain function, I wrote the following code.

// Metadata parsing
let res = parse_x509_certificate(cert);
match res {
    Ok((_rem, certificate)) => {
        let metadata: Metadata = parse_metadata(init_metadata, certificate)
            .map_err(|e| ParserErrors::MetadataParseFail(format!("{e}")))?;

        // The first cert is the end entity cert (or pre cert, if is_pre_cert is true), 
        // and the last is the root CA.
        if is_precert {
            if idx_chain == 0 {
                // write PRECERT
                write_cert_metadata(metadata, "precert", output_path)?;
            } else {
                // write CA
                write_cert_metadata(metadata, "ca", output_path)?;
            }
        } else {
            if idx_chain == 0 {
                // write CERT
                write_cert_metadata(metadata, "cert", output_path)?;
            } else {
                // write CA
                write_cert_metadata(metadata, "ca", output_path)?;
            }
        }
    },
    // return early with a parser error if the certificate cannot be parsed
    Err(err) => return Err(ParserErrors::CertParseFail(format!("{err}"))),
}
idx_chain += 1;

This writes the certificate metadata to the correct file. As mentioned earlier, the first certificate is the end-entity certificate (or the precertificate, if is_pre_cert is true) and the others are CAs.
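
The selection logic can also be expressed more compactly. The sketch below is one possible refactoring, not the article's code: cert_type and classify_chain are hypothetical helpers, and enumerate() replaces the manual idx_chain counter.

```rust
/// Hypothetical helper: maps a position in the chain to an output file type.
pub fn cert_type(is_precert: bool, idx_chain: usize) -> &'static str {
    match (idx_chain, is_precert) {
        (0, true) => "precert", // first entry of a precertificate chain
        (0, false) => "cert",   // first entry of a regular chain
        _ => "ca",              // every other entry is a CA certificate
    }
}

/// Returns the file type for each certificate of a chain, in order.
pub fn classify_chain(chain: &[Vec<u8>], is_precert: bool) -> Vec<&'static str> {
    chain
        .iter()
        // `enumerate` yields (index, item) pairs, so no mutable counter is needed.
        .enumerate()
        .map(|(idx_chain, _cert)| cert_type(is_precert, idx_chain))
        .collect()
}
```

Matching on the tuple (idx_chain, is_precert) covers the four branches of the nested if/else in three arms.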

The write_cert_metadata() function matches on the certificate type to build the output file path. Then, we use OpenOptions to open the file in append mode, so that each record is added without deleting the existing contents.

Next, we instantiate a new WriterBuilder and give it the file opened with OpenOptions as its writer.

Finally, we write the record to the file.

fn write_cert_metadata(
    metadata: Metadata, 
    cert_type: &str, 
    output_path: &str, 
) -> ParserResult<()> {
    // get file path
    let output_file_path = match cert_type {
        "precert" | "cert" | "ca" => format!("{output_path}/{cert_type}.scsvh"),
        _ => return Err(ParserErrors::InvalidCertType(cert_type.to_string())),
    };

    // open in append mode, creating the file if needed;
    // report a failure as an error instead of panicking
    let file = OpenOptions::new()
        .write(true)
        .append(true)
        .create(true)
        .open(&output_file_path)
        .map_err(|e| ParserErrors::CreateFileFail(output_file_path.clone(), format!("{e}")))?;

    let mut writer = csv::WriterBuilder::new()
        .quote_style(csv::QuoteStyle::Always)
        .delimiter(b';')
        .from_writer(file);

    writer.write_record(metadata.into_iter())
        .map_err(|e| ParserErrors::CsvWriter(format!("{e}")))?;

    Ok(())
}

Since we implemented IntoIterator for &Metadata, we can pass it to the write_record() method of the csv crate, which takes an iterator as input.
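
To see what ends up in the file, here is a dependency-free sketch that mimics the writer configuration used above (semicolon delimiter, every field quoted, embedded quotes doubled). format_record and append_record are hypothetical helpers; the real application relies on the csv crate for this.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: quotes every field, doubles embedded quotes,
/// and joins the fields with a semicolon.
pub fn format_record<'a, I>(fields: I) -> String
where
    I: IntoIterator<Item = &'a str>,
{
    fields
        .into_iter()
        .map(|field| format!("\"{}\"", field.replace('"', "\"\"")))
        .collect::<Vec<String>>()
        .join(";")
}

/// Hypothetical helper: appends one record to `path`, creating the file if needed.
pub fn append_record<'a, I>(path: &Path, fields: I) -> io::Result<()>
where
    I: IntoIterator<Item = &'a str>,
{
    // `append(true)` keeps the existing contents and writes at the end.
    let mut file = OpenOptions::new().append(true).create(true).open(path)?;
    writeln!(file, "{}", format_record(fields))
}
```

Quoting every field is what makes the semicolon safe as a delimiter, even when a value such as an issuer name contains one.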

Let's run the application to check that it works.

cargo run

└─$ wc -l assets/test_dir/*
    1862 assets/test_dir/entries_209915904-209919999.csvh
    4097 assets/test_dir/entries_209920000-209924095.csvh
    4097 assets/test_dir/entries_209924096-209928191.csvh
   10056 total

└─$ wc -l output/*
   20674 output/ca.scsvh
    6457 output/cert.scsvh
    3596 output/precert.scsvh
   30727 total

The application successfully parsed about 10,000 certificate chains, for a total of about 30,700 certificates.

Adding an arguments parser

Now, I will add an argument parser so that the application can be called with different I/O paths. There is a crate for building argument parsers, called clap.

You can instantiate a parser using the Command::new() method and provide useful information such as the author, version and description of the application.

// Parse command arguments
let args: ArgMatches = Command::new("x509 Parser")
    .version("0.1.0")
    .author("Pezzz")
    .about("x509 Parser made for CT certificate")
    .arg(Arg::new("input")
            .short('i')
            .long("input")
            .num_args(1)
            .required(true)
            .help("Input data dir path"))
    .arg(Arg::new("output")
            .short('o')
            .long("output")
            .num_args(1)
            .required(true)
            .help("Output data dir path"))
    .get_matches();

let input_path = args.get_one::<String>("input").unwrap();
let output_path = args.get_one::<String>("output").unwrap();

For each argument, you can specify the number of values it expects, whether it is required, and more. You can retrieve an argument with args.get_one::<type>("<argument_name>").
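
To appreciate what clap takes care of, here is roughly what handling the same two required options by hand could look like with the standard library only. This is a deliberately naive sketch: parse_io_args is a hypothetical function, and it offers no help text, no version flag and only basic error messages.

```rust
/// Hypothetical hand-written parser for `-i/--input` and `-o/--output`.
/// `args` is expected to hold the arguments without the program name.
pub fn parse_io_args(args: &[String]) -> Result<(String, String), String> {
    let mut input = None;
    let mut output = None;

    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        match arg.as_str() {
            // Each option consumes the next argument as its value.
            "-i" | "--input" => input = iter.next().cloned(),
            "-o" | "--output" => output = iter.next().cloned(),
            other => return Err(format!("unexpected argument: {other}")),
        }
    }

    // Both options are required.
    match (input, output) {
        (Some(i), Some(o)) => Ok((i, o)),
        (None, _) => Err("missing required argument: --input".to_string()),
        (_, None) => Err("missing required argument: --output".to_string()),
    }
}
```

In a real program, the slice would typically come from std::env::args().skip(1) collected into a Vec<String>.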

Then, when launching the application, you can print help using the --help switch.

target/release/x509-certificate-parser --help
x509 Parser made for CT certificate

Usage: x509-certificate-parser --input <input> --output <output>

Options:
  -i, --input <input>    Input data dir path
  -o, --output <output>  Output data dir path
  -h, --help             Print help
  -V, --version          Print version

Performance analysis

At this point, the execution time of the application, for an input directory of 5.4G (970,990 certificate chains), is:

  • 229.76s: with release mode
  • 224.31s: with release mode and LTO

It produces output files containing more than 2,992,000 parsed certificates.

Info
Tests were run in a virtual machine.

To analyse the performance of the application in more depth, we will use flamegraphs, a visualization technique developed by Brendan Gregg, a cloud computing performance engineer.

What is a flamegraph ?

A flamegraph is a type of visualization used to analyze and understand the performance characteristics of software systems. It provides a way to visualize the stack trace of a program, highlighting the most frequently executed functions and their relationship to each other.

A flamegraph is drawn as a stack of horizontal bars, each bar representing a function (a stack frame). The width of a bar is proportional to the number of samples in which the function was present on the stack, that is, the time spent in it and in the functions it calls. The vertical position shows the stack depth: each bar sits on top of its caller, so the top edge shows the functions that were actually running on the CPU. The horizontal order does not represent time; frames at the same level are typically sorted alphabetically so that identical frames can be merged.

Generate a flamegraph

To generate my flamegraph I will use a Rust port of the Flamegraph Project.

First, build the inferno Rust project. It will generate binaries in the target/release directory.

# Build inferno
git clone https://github.com/jonhoo/inferno.git
cd inferno
cargo build --release

Tip

When running performance tests, be sure to drop the caches between runs; otherwise, the measurements may be inaccurate.

You can do so with this command (run as root): echo 3 > /proc/sys/vm/drop_caches

Warning
Be careful: avoid doing this in a production environment, as it may cause unexpected behavior if some files are still in use.

We will use these two binaries: inferno-collapse-perf & inferno-flamegraph.

Now, we have to capture the stack samples that will be used to create the flamegraph. Samples are used to determine which functions are being executed and how much time is spent in each of them. To do so, we can use the Linux perf command. Then, using inferno, we can create the graph from the captured samples.

# Install the perf package
sudo apt install linux-tools

# Create the flamegraph using perf and the inferno binaries
sudo perf record --call-graph dwarf x509-certificate-parser --input assets --output output
sudo perf script | ./inferno-collapse-perf > stacks.folded
cat stacks.folded | ./inferno-flamegraph > flamegraph.svg

These commands generate the flamegraph.svg file.
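
The intermediate stacks.folded file is plain text: one line per unique stack, with frames separated by semicolons, followed by a space and the number of samples. Percentages like the ones read from the graph can therefore be recomputed from it. The sketch below assumes that format; inclusive_share is a hypothetical helper and the frame names used with it are made up.

```rust
/// Hypothetical helper: returns the percentage of samples whose stack
/// contains `frame`, given the contents of a folded-stacks file.
pub fn inclusive_share(folded: &str, frame: &str) -> f64 {
    let mut total: u64 = 0;
    let mut matching: u64 = 0;

    for line in folded.lines() {
        // Each line looks like "frame1;frame2;...;frameN count".
        let (stack, count) = match line.rsplit_once(' ') {
            Some(parts) => parts,
            None => continue, // skip malformed lines
        };
        let count: u64 = match count.trim().parse() {
            Ok(n) => n,
            Err(_) => continue, // skip lines without a valid sample count
        };

        total += count;
        if stack.split(';').any(|f| f == frame) {
            matching += count;
        }
    }

    if total == 0 {
        0.0
    } else {
        matching as f64 * 100.0 / total as f64
    }
}
```

This is the same quantity that the width of a bar represents in the flamegraph: the share of samples in which a function appears anywhere on the stack.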

How to read a flamegraph ?

A flamegraph looks like this. Cool, isn't it? :p

[Figure: flamegraph]

It lets us see which functions take most of the execution time. Here, we clearly see that our I/O functions take most of it.

Our Reader takes 33% of the execution time while the Writer takes 21%.

[Figure]

We also see that fingerprint calculation takes almost 15% of the execution time.

The end…

I tried to explain the Rust concepts that I learned and to demonstrate them with the x509 certificate parser.

Rust’s focus on memory safety and thread safety makes it an ideal choice for building high-performance, concurrent applications.

In our next article, we’ll explore Rust’s support for concurrency and parallelism, and show how we can use these features to further improve the performance of our x509 certificate parser.

📖 Bibliography

Other cool projects