Everything you need to know about YAML!


✨ Avid Learner 👨‍💻 Associate Programmer Analyst @ Moody's

In this blog, we are going to cover one of the most important topics in DevOps: YAML, a recursive acronym for "YAML Ain't Markup Language".

We will see what YAML is, why we need it, how it works, where it is used, and more.

What is YAML?

Before getting to know what YAML is, let us first discuss what a markup language is. Examples of markup languages are HTML (HyperText Markup Language) and XML (Extensible Markup Language). A markup language like HTML describes the structure and formatting of a document, including the parent-child relationships among its elements. It controls how the document looks when rendered on the user's screen.


YAML, which originally stood for "Yet Another Markup Language", is a data serialization language: a data format used to store and exchange data, similar to JSON and XML. It is mainly used to store configuration in an easy, human-readable form. YAML is not a programming language.

YAML files use the extension .yml or .yaml.

YAML is also case-sensitive.

Note: Multiple YAML documents in one file are separated by three dashes ( --- ). Three dots ( ... ) can optionally mark the end of a document.

Data Serialization and De-serialization

Consider an example where you have an object and you want to share that object in various locations, such as a mobile app, an ML model, or some other web app. We need some kind of data format that is shareable everywhere. Here comes the concept of serialization and deserialization.

Serialization is the process of converting an object, which may be a complex data structure, into a stream of bytes that captures the object's state and can be transmitted easily between devices. This stream can be stored in a database or in a file such as XML or YAML.

The reverse is deserialization, where a stream of bytes is converted back into an object. For example, when you make an API call, you typically receive JSON data as a string, which JavaScript converts to an object with the JSON.parse() method.

In simple terms, data serialization languages are used to store the object in a shareable form that can be edited in the form of text. Later this file can be converted back to an object as and when needed.
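The round trip described above can be sketched in a few lines. This is only an illustration that uses Python's standard json module as the serialization format (Python's json.loads plays the role that JSON.parse() plays in JavaScript); the student object and its field names are made up for the example.

```python
import json

# Hypothetical object to share; the field names are made up for this sketch.
student = {"name": "Sagar Wadhwa", "marks": 783, "subjects": ["maths", "physics"]}

# Serialization: in-memory object -> text -> bytes that can be sent or stored.
serialized_text = json.dumps(student)
serialized_bytes = serialized_text.encode("utf-8")

# Deserialization: bytes -> text -> an equivalent in-memory object.
restored = json.loads(serialized_bytes.decode("utf-8"))

print(serialized_text)
print(restored == student)
```

The restored object is a new object with the same contents as the original, which is what makes the serialized form safe to pass between programs.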

But why "YAML Ain't Markup Language"? 🤔 The name makes the point that YAML is meant for storing data, not for marking up documents the way HTML does.

Uses of YAML

  • Used to create Docker/Kubernetes configuration files.

  • Used in logs, caches etc.

  • Used to store an object which needs to be transmitted on the network.

Why YAML?

  • Objects' states can be stored in a simple and easy-to-read format.

  • Easy to create YAML files.

  • It has a strict, indentation-based syntax, so many mistakes are caught as soon as the file is parsed.

  • Easily convertible to other data serialization languages like JSON, XML etc.

  • Most programming languages have libraries for reading and writing YAML.

  • Popular data serialization language, with community support.

  • More powerful when representing complex data.

  • Can use various tools with it like parsers.

  • Parsing (reading of data) is easy from YAML files.

YAML files in action!

Enough theory. Now let us see YAML files at work in an IDE, along with the different data types. Get your IDE ready! 👨‍💻

Data representation in YAML

  1. Key-Value pairs

    Key-value pairs in a YAML file are just a textual representation, not a hashmap. A parser reads the file and builds a hashmap (or dictionary) from it.

    # Key-value pairs
    apple: "I am a red fruit"
    1: "This is my roll number"
    ---
    # The same kind of data in flow style, which does not depend on indentation
    { apple: "I am a red fruit", 1: Mango }
    
  2. Lists

    Lists are represented using a dash ( - ).

    # Lists
    - apple
    - mango
    - papaya
    ...
    
  3. Block Style

    Here the value of a key is a list, written one item per line.

    # Block style representation
    cities:
      - mumbai
      - delhi
      - new york
    

    If you want to avoid relying on indentation, you can write the same list in flow style.

    # Flow style representation
    cities: [mumbai, delhi, new york]
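Flow style is close to JSON: once the strings are quoted, the same text is also valid JSON. As a rough sketch of the structure a parser builds from such a file, the snippet below uses Python's standard json module as a stand-in for a YAML parser (an assumption made purely for illustration; the variable names are hypothetical).

```python
import json

# The flow-style list from above, written with JSON's stricter quoting.
flow_style_text = '{"cities": ["mumbai", "delhi", "new york"]}'

# The parser builds a dictionary whose "cities" value is a list.
config = json.loads(flow_style_text)

print(type(config).__name__)
print(config["cities"])
print(config["cities"][2])
```

The block-style version of the same list would load into the same dictionary; only the way it is written differs.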
    

Datatypes in YAML

In YAML, data types are detected automatically, but we can also specify them explicitly. To do so, put a tag before the value: two exclamation marks followed by the type name. Example:

marks: !!int 783
  1. String Data type

    # Strings in YAML
    name: Sagar Wadhwa
    # Sagar Wadhwa is a string value
    ---
    name: !!str  "Sagar Wadhwa"
    fruit: 'apple'
    ---
    

    In the above code, the string "Sagar Wadhwa" can be enclosed in double quotes, in single quotes, or left unquoted. All three forms produce a string value.

    If you spread a string over multiple lines without indenting the continuation, the second line is treated as a separate element without a key, which gives an error.

    # This will give an error, since the second line is treated as a separate element.
    # Every top-level entry here must be a key-value pair, which the second line
    # is not.
    bio: Hey my name is sagar
    wadhwa and i am a good boy
    

    If we want to store a string across multiple lines and preserve its line breaks and relative indentation, put the | indicator (literal block style) after the key.

    # To keep the line breaks exactly as typed, use the pipe indicator.
    # The text must be indented under the key.
    bio2: |
      Hey my name is sagar
      wadhwa and i am a good boy
    

    Sometimes a string is too long to fit comfortably on one line, so we want to type it over several lines while the actual value stays a single line. For this, put the > indicator (folded block style) after the key.

    house: >
      This will
      be in actual
      a single line only
    # same as the following, apart from one trailing newline that > keeps
    house: This will be in actual a single line only
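To make the difference between | and > concrete, here is a simplified model of the string each style produces. It is a sketch, not a YAML parser: the helper names literal_block and folded_block are hypothetical, and the model ignores blank lines, extra indentation and the chomping indicators (such as >- to drop the final newline).

```python
def literal_block(lines):
    """Model of `|`: keep each line break and end with one newline."""
    return "\n".join(lines) + "\n"


def folded_block(lines):
    """Simplified model of `>`: join non-empty lines with single spaces."""
    return " ".join(lines) + "\n"


bio2 = literal_block(["Hey my name is sagar", "wadhwa and i am a good boy"])
house = folded_block(["This will", "be in actual", "a single line only"])

print(repr(bio2))
print(repr(house))
```

Printing with repr makes the newline characters visible: the literal value keeps the break between the two lines, while the folded value contains only the final newline.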
    
  1. Integer Data type

    Used to store integer values, like 78.

    # Integer data type
    marks: 67
    # Can manually specify the data type also 
    rent: !!int 45
    
  2. Float Data type

    Used to store floating point values, like 6.64.

    # Float data type
    gpa: 8.83
    
    # Can manually specify the data type also 
    marks: !!float 56.67
    
    # Storing infinity
    infinity: !!float .inf # JSON has no infinity; converters often emit null instead.
    
  3. Boolean Data type

    Used to store a value that is either true or false.

    # Boolean data type
    isEligible: !!bool true # also True, TRUE; YAML 1.1 parsers may also accept yes/Yes/YES and on
    isAdmitted: false # also False, FALSE; YAML 1.1 parsers may also accept no/No/NO and off
    
  4. Storing numbers in other number systems and in exponential notation

    We can also store numbers written in other number systems, such as binary, hexadecimal and octal.

    # Storing a binary value (starts with '0b'; YAML 1.1 notation)
    binaryNumber: !!int 0b1101
    # Storing a hexadecimal value (starts with '0x')
    hexadecimalNumber: !!int 0x45E
    # Storing an octal value (starts with '0' in YAML 1.1; YAML 1.2 uses '0o')
    octalNumber: !!int 05632
    # Storing a number in exponential notation
    bigNumber: 6.023E+56 # represents 6.023 × 10^56; some YAML 1.1 parsers need the explicit sign
    

    To make large numbers easier to read, groups of digits can be separated with underscores in place of commas (a YAML 1.1 feature).

    commaValues: !!int +540_000 # 540,000

    # For storing a value which is not a number
    not a number: .nan # JSON has no NaN; converters often emit null instead.
    
  5. NULL value

    Sometimes a key may have a NULL value or the key itself can be NULL. The representation of the same is given below.

    # NULL VALUE
    middleName: !!null null # null, NULL, ~
    # NULL KEY
    ~: "Example where key is null"
    
  6. Dates and Times

    For dates and times, we can specify the datatype manually by typing !!timestamp or simply just write the date and time.

    # specifying the time zone (IST), by default the time zone is UTC
    canonical:        2001-12-15T02:59:43.10 +5:30 # YYYY-MM-DD
    
    space separated:  2001-12-14 21:59:43.10 -5
    
    # no time zone
    no time zone: 2001-12-15 2:59:43.10
    
    # Simple date in the format of YYYY-MM-DD
    date: 2002-12-14
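The numeric notations from the list above can be cross-checked without a YAML parser, because Python's built-in int() and float() read the same forms. This is only a sanity check of the values, under the assumption that the YAML file is read with YAML 1.1 rules; the variable names simply mirror the keys used earlier.

```python
import math

# Python's int() and float() read the same notations; this is not a YAML parser.
hexadecimal_number = int("0x45E", 16)  # 1118
octal_number = int("5632", 8)          # 2970, written 05632 in YAML 1.1
binary_number = int("0b1101", 2)       # 13
comma_values = int("540_000")          # 540000, underscores are ignored
big_number = float("6.023E56")         # 6.023 times 10 to the power 56
infinity = float("inf")
not_a_number = float("nan")

print(hexadecimal_number, octal_number, binary_number, comma_values)
print(big_number > 10**56)
print(math.isinf(infinity), math.isnan(not_a_number))
```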
    

Advanced Datatypes in YAML

  1. Sequence / List

    student: !!seq
      - marks
      - name
      - rno
    ---
    #  Can be also stored as
    student: !!seq [marks, name, rno]
    ---
    # Some entries of a sequence can be empty (null); this is known as a
    # sparse sequence
    sparse sequence: !!seq
      - one: "apple"
      - two: "banana"
      - three: "mango"
      -
      -
      - four: "cabbage"
      - five: "tomato"
    ---
    
  2. Nested Sequence

    We can also have a list inside a list, known as a nested list or nested sequence.

    - 
      # This will be one list inside a list
      - apple
      - banana
      - kiwi
    - 
      # This will be another list inside a list
      - sick
      - good
      - boy
      - he
    
  3. Key-Value Pairs [maps] and Nested Mappings

    We can have key-value pairs, known as maps, represented as !!map. There can be nested maps also (map within a map).

    # nested mapping 
    name: Sagar Wadhwa
    roles:
      # Another map 
      age: 20
      job: student
    ---
    # The same mapping written in flow style
    {name: Sagar Wadhwa, roles: {age: 20, job: student}}
    
  4. Pairs

    Used when the same key needs to appear more than once. These are stored as an ordered sequence of single-pair maps, so duplicate keys survive.

    pairExample: !!pairs
      - job: Teacher
      - job: Student
    ---
    # Same as
    pairExample: !!pairs [job: Teacher, job: Student]
    
  5. Set

    A set holds unique values. It contains only items, with no values attached to them; each item is introduced with ?.

    names: !!set
      ? kunal
      ? sagar
      # ? sagar  is a duplicate, which a set does not allow (strict parsers report an error).
    
  6. Dictionary

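    An ordered map keeps its key-value pairs in a fixed order and does not allow duplicate keys. It is written as a sequence of single-pair maps and tagged with !!omap. In the example below, each person's value is itself a sequence.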

    people: !!omap
      - Sagar:
          - name: Sagar
          - age: 20
          - height: 34
      - Kunal:
          - name: Kunal
          - age: 26
          - height: 84
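The difference between !!set, !!pairs and !!omap is easier to see in terms of the structures a program ends up with. The sketch below models them with plain Python values rather than a YAML library, so the exact types a real parser returns are an assumption; the data is a shortened version of the examples above.

```python
# !!set: modeled as a mapping whose values are all null, so a repeated
# item collapses into a single entry in this model.
names = dict.fromkeys(["kunal", "sagar", "sagar"])

# !!pairs: an ordered list of key-value pairs; the same key may repeat.
pair_example = [("job", "Teacher"), ("job", "Student")]

# A plain mapping cannot hold both entries: the later "job" wins.
as_plain_map = dict(pair_example)

# !!omap: ordered like !!pairs, but each key may appear only once.
people = [("Sagar", {"age": 20}), ("Kunal", {"age": 26})]
keys = [key for key, _ in people]
has_unique_keys = len(keys) == len(set(keys))

print(list(names))
print(pair_example, as_plain_map)
print(has_unique_keys)
```

This is why !!pairs exists at all: a regular map would silently lose one of the two job entries.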
    

Reusing properties in YAML

If a particular property has to be used again and again for different keys then instead of typing it multiple times we can reuse the property from a main base key-value pair.

likings: 
  fruit: mango
  dislikes: grapes

person1:
  name: Sagar Wadhwa
  fruit: mango
  dislikes: grapes

person2:
  name: Rahul
  fruit: mango
  dislikes: grapes

person3:
  name: Rashi
  fruit: mango
  dislikes: grapes

# repetition of code, where the fruit and dislikes are being copied again and 
# again

In the above code, 3 persons have the same fruit and dislike value as the likings key's values. So it has been copied again and again. But instead of copying, we can reuse the value by specifying what value has to be taken as the base and copied.

This is done with the help of anchors (&likes in the example below) and aliases (*likes), combined with the merge key <<.

likings: &likes # This is the anchor; it can take any name and marks what will be copied.
  fruit: mango
  dislikes: grapes
person1:
  name: Sagar Wadhwa
  <<: *likes  # Alias to the anchor: the merge key << inserts the anchored
              # map's key-value pairs here
person2:
  name: Rahul
  <<: *likes
  dislikes: berries # Can also override a value; here dislikes is
                    # overridden
person3:
  name: Rashi
  <<: *likes

The above code is the same as the previous code, but values need not be typed again and again.
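The effect of the merge key can be modeled with ordinary dictionaries. This is a sketch of the resulting data, not of how a YAML parser works internally; the helper name merge is hypothetical.

```python
likes = {"fruit": "mango", "dislikes": "grapes"}


def merge(base, own):
    """Model of `<<`: copy the base pairs, then let the mapping's own keys win."""
    return {**base, **own}


person1 = merge(likes, {"name": "Sagar Wadhwa"})
person2 = merge(likes, {"name": "Rahul", "dislikes": "berries"})
person3 = merge(likes, {"name": "Rashi"})

print(person1)
print(person2)
print(likes)
```

Only person2 ends up with berries; the anchored map itself is left unchanged, so person1 and person3 still get grapes.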

Great! You made it to the end of this blog 🎉🎉!

I hope that you were able to get an idea of what YAML files are and why they are needed. Don't forget to like this blog and share it with others. Thank You!