Virtual Output Queues in Distributed Networking Systems: An Overview

Jeswin Augustine February 12, 2023

In a distributed networking system, data transmission between nodes can become congested due to limited bandwidth and high traffic volume. To manage this congestion and ensure efficient data transmission, virtual output queues (VOQs) are utilized. In this blog, we will explain what virtual output queues are and how they work in distributed networking systems.

What are Virtual Output Queues?

Virtual output queues are a queueing method used in network switches to manage congestion. Instead of keeping a single FIFO queue per input port, the switch keeps a separate queue for each output port, and each incoming packet is stored in the queue that corresponds to its destination port until it can be transmitted. Because a packet waiting for a busy output port no longer blocks the packets behind it that are headed to idle ports (the head-of-line blocking problem), congestion at one output port does not stall traffic to the others.

Why are Virtual Output Queues used in Distributed Networking Systems?

In a distributed networking system, there are multiple nodes that communicate with each other. When many nodes are transmitting data at the same time, the network can become congested, and data transmission can become slow or even come to a halt. VOQs are used in these systems to help manage the congestion and ensure efficient data transmission.

How do Virtual Output Queues work in Distributed Networking Systems?

The following steps illustrate how VOQs work in a distributed networking system:

Data packets are received at the switch: When data packets are received at the switch, they are stored in the input buffer.
Data packets are assigned to an output queue: The switch places each packet in the virtual output queue for its destination port. A scheduling algorithm then decides which queues are served next, considering factors such as the available bandwidth and the amount of data in each queue.
Data packets are transmitted: Once the data packets are assigned to an output queue, they are transmitted from the switch to the appropriate node. Within each queue, the data packets are transmitted in the order in which they were received, so the earliest received packets are transmitted first.
Feedback from the nodes: The nodes provide feedback to the switch, indicating the amount of available bandwidth and the amount of data in each queue. This information is used by the scheduling algorithm to determine how to distribute the data packets in the future.
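
The steps above can be sketched in C. This is a minimal illustration, not the design of any particular switch: it assumes the common arrangement in which every input port keeps one FIFO queue per output port, and it uses a plain round-robin scheduler per output. The names `VoqSwitch`, `voq_enqueue`, `voq_dequeue`, `NUM_PORTS` and `QUEUE_DEPTH` are hypothetical, and the feedback step is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_PORTS 4    /* hypothetical port count */
#define QUEUE_DEPTH 8  /* hypothetical per-queue capacity */

/* One FIFO ring buffer of packet ids. */
typedef struct {
    int pkts[QUEUE_DEPTH];
    size_t head;   /* index of the oldest packet */
    size_t count;  /* number of packets waiting */
} Fifo;

/* voq[in][out] holds packets that arrived on input `in` for output `out`. */
typedef struct {
    Fifo voq[NUM_PORTS][NUM_PORTS];
    size_t next_input[NUM_PORTS];  /* round-robin pointer per output */
} VoqSwitch;

void voq_init(VoqSwitch *sw) {
    for (size_t in = 0; in < NUM_PORTS; in++) {
        sw->next_input[in] = 0;
        for (size_t out = 0; out < NUM_PORTS; out++) {
            sw->voq[in][out].head = 0;
            sw->voq[in][out].count = 0;
        }
    }
}

/* Store a packet in the queue for its (input, output) pair.
   Returns false (packet dropped) if that queue is full. */
bool voq_enqueue(VoqSwitch *sw, size_t in, size_t out, int pkt) {
    assert(in < NUM_PORTS && out < NUM_PORTS);
    Fifo *q = &sw->voq[in][out];
    if (q->count == QUEUE_DEPTH) {
        return false;
    }
    q->pkts[(q->head + q->count) % QUEUE_DEPTH] = pkt;
    q->count++;
    return true;
}

/* Pick the next packet for output `out`, visiting inputs round-robin.
   Returns true and writes the packet id to *pkt, or false if every
   queue for this output is empty. */
bool voq_dequeue(VoqSwitch *sw, size_t out, int *pkt) {
    assert(out < NUM_PORTS);
    for (size_t step = 0; step < NUM_PORTS; step++) {
        size_t in = (sw->next_input[out] + step) % NUM_PORTS;
        Fifo *q = &sw->voq[in][out];
        if (q->count > 0) {
            *pkt = q->pkts[q->head];  /* oldest packet first */
            q->head = (q->head + 1) % QUEUE_DEPTH;
            q->count--;
            sw->next_input[out] = (in + 1) % NUM_PORTS;
            return true;
        }
    }
    return false;
}
```

With packets 10 and 11 queued from input 0 and packet 20 from input 1, all for output 2, repeated calls to `voq_dequeue(&sw, 2, &pkt)` return 10, 20, 11: each queue stays in arrival order while the inputs take turns. A packet queued for output 3 is unaffected by how busy output 2 is.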

Benefits of Using Virtual Output Queues in Distributed Networking Systems

Improved bandwidth utilization: VOQs help to distribute the load evenly across multiple output ports, improving the utilization of available bandwidth.
Reduced congestion: By distributing the load evenly, VOQs help to reduce congestion at a single port, ensuring efficient data transmission.
Improved fairness: VOQs help to ensure fair distribution of available bandwidth among the nodes, preventing a single node from monopolizing the bandwidth.

In conclusion, virtual output queues are an important method used in distributed networking systems to manage congestion and ensure efficient data transmission. By distributing incoming data packets across multiple output queues, VOQs help to improve bandwidth utilization, reduce congestion, and ensure fair distribution of available bandwidth.

Categories: Programming

Network data pipeline of Broadcom Jericho

Jeswin Augustine February 11, 2023

Introduction:

Network data pipelines play a crucial role in ensuring the efficient and seamless transfer of data over a network. Broadcom Jericho-based switches are a popular choice in modern data center networks due to their advanced data pipeline technology. In this blog post, we will take a closer look at the network data pipeline of a Broadcom Jericho-based switch and how it operates.

The Broadcom Jericho Data Pipeline:

The Broadcom Jericho data pipeline is a multi-stage process in which several components work together to transfer data over a network. The main components of the pipeline are:

Input Ports: The input ports are responsible for receiving incoming data from other network devices. They have the capability to detect the incoming data speed and adjust accordingly.
MAC Layer: The MAC (Media Access Control) layer is responsible for controlling access to the shared media. It handles the framing and error detection of incoming data and ensures that the data is passed on in a format that the other components in the pipeline can understand.
Switch Fabric: The switch fabric is responsible for forwarding the data from one port to another within the switch. It operates in parallel to ensure that data is transmitted as quickly as possible.
Output Ports: The output ports are responsible for transmitting the data to the other network devices. They are capable of adjusting the transmission speed to match that of the incoming data.
Table Lookup: The table lookup component is responsible for determining the destination of the incoming data. It uses a forwarding table to determine the next hop for the data based on the destination address.
Quality of Service (QoS): The QoS component is responsible for prioritizing the different types of data based on their importance. This ensures that critical data is transmitted first, reducing the likelihood of congestion and delay.
Security: The security component is responsible for implementing various security measures to protect the data being transmitted. This can include firewalls, intrusion detection, and encryption.
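
To make the table lookup stage more concrete, here is a small C sketch of a longest-prefix-match lookup over IPv4 destinations. It is a generic illustration only and does not describe how Broadcom implements the lookup: the `Route` type and the `prefix_mask` and `lookup_port` functions are hypothetical names, and the linear scan is chosen for readability, whereas switch hardware typically uses dedicated lookup memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One route: destination prefix, prefix length in bits, egress port. */
typedef struct {
    uint32_t prefix;
    uint8_t len;   /* 0..32 */
    int out_port;
} Route;

/* Build the network mask for a prefix length; length 0 matches everything. */
uint32_t prefix_mask(uint8_t len) {
    assert(len <= 32);
    return len == 0 ? 0u : 0xFFFFFFFFu << (32 - len);
}

/* Longest-prefix match by linear scan. Returns the egress port of the
   most specific matching route, or -1 if no route matches. */
int lookup_port(const Route *table, size_t n, uint32_t dst) {
    int best_port = -1;
    int best_len = -1;
    for (size_t i = 0; i < n; i++) {
        uint32_t mask = prefix_mask(table[i].len);
        if ((dst & mask) == (table[i].prefix & mask) &&
            (int)table[i].len > best_len) {
            best_len = table[i].len;
            best_port = table[i].out_port;
        }
    }
    return best_port;
}
```

Given routes for 10.0.0.0/8 on port 1 and 10.1.0.0/16 on port 2, a packet for 10.1.2.3 matches both and is sent to port 2, because the /16 entry is the more specific one.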

Conclusion:

The Broadcom Jericho data pipeline is a complex but highly efficient system that enables fast and secure data transfer over a network. Its multi-stage architecture and advanced components ensure that data is transmitted quickly and accurately, helping to keep modern data centers running smoothly. Whether you are an IT professional or just curious about network technology, understanding the Broadcom Jericho data pipeline is an important step towards a better understanding of how data is transmitted over a network.

Categories: Programming

HackerRank – Array Manipulation

Jeswin Augustine August 26, 2020

Problem

Starting with a 1-indexed array of zeros and a list of operations, for each operation add a value to each array element between two given indices, inclusive. Once all operations have been performed, return the maximum value in the array.

Example

Queries are interpreted as follows:

Add the values of k between the indices a and b inclusive:

index->	 1 2 3  4  5 6 7 8 9 10
	[0,0,0, 0, 0,0,0,0,0, 0]
	[3,3,3, 3, 3,0,0,0,0, 0]
	[3,3,3,10,10,7,7,7,0, 0]
	[3,3,3,10,10,8,8,8,1, 0]

The largest value is 10 after all operations are performed.
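
The walkthrough above can be reproduced with a direct, brute-force implementation. The sketch below is only a baseline for checking small cases: the function name `brute_force_max` is hypothetical, and the three queries (1 5 3, 4 8 7, 6 9 1) are read off the rows of the table. It takes O(n * q) time, which is why a faster approach is needed for large inputs.

```c
#include <assert.h>
#include <stdlib.h>

/* Apply every query directly to the array, then scan for the maximum.
   queries[i] = {a, b, k} with 1-based inclusive indices.
   Returns -1 if the allocation fails. */
long brute_force_max(int n, int q, const int queries[][3]) {
    assert(n > 0);
    long *arr = calloc((size_t)n + 1, sizeof *arr);  /* index 0 unused */
    if (!arr) {
        return -1;
    }
    for (int i = 0; i < q; i++) {
        for (int j = queries[i][0]; j <= queries[i][1]; j++) {
            arr[j] += queries[i][2];
        }
    }
    long max = 0;
    for (int j = 1; j <= n; j++) {
        if (arr[j] > max) {
            max = arr[j];
        }
    }
    free(arr);
    return max;
}
```

Calling `brute_force_max(10, 3, queries)` with the three queries above returns 10, matching the last row of the table.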

Function Description

Complete the function arrayManipulation in the editor below.

arrayManipulation has the following parameters:

int n – the number of elements in the array
int queries[q][3] – a two dimensional array of queries where each queries[i] contains three integers, a, b, and k.

Returns

int – the maximum value in the resultant array

Input Format

The first line contains two space-separated integers n and m, the size of the array and the number of operations.
Each of the next m lines contains three space-separated integers a, b and k, the left index, right index and summand.

Constraints

Sample Input

5 3
1 2 100
2 5 100
3 4 100

Sample Output

200

Explanation

After the first update the list is 100 100 0 0 0.
After the second update list is 100 200 100 100 100.
After the third update list is 100 200 200 200 100.

The maximum value is 200

Solution

After contemplating the popular approach for solving this, here is how I wrapped my head around it.

For every input line of a-b-k, you are given the range (a to b) where the values increase by k. So instead of keeping track of the actual values, just keep track of the rate of change (i.e. a slope): where the increase starts and where it stops. This is done by adding k to the “a” position of your array and adding -k to the “b+1” position of your array for every input line a-b-k, and that’s it. “b+1” is used because the increase still applies at “b”.

The maximum final value is equivalent to the maximum accumulated “slope” starting from the first position, because that is the spot that was incremented more than any other. Accumulated “slope” means that you add the slope change at position 0 to position 1, then add that to position 2, and so forth, looking for the point where the running total is greatest. This was suggested by richardpvogt.
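
As a small worked illustration of the idea, the two steps can be written as separate helpers. This is a sketch with hypothetical names (`mark_range`, `accumulate_max`); it assumes the caller supplies a zeroed `diff` array with at least n + 2 entries so that index b + 1 is always valid.

```c
#include <assert.h>

/* Record one range update [a, b] += k in the difference array. */
void mark_range(long *diff, int a, int b, long k) {
    assert(a >= 1 && a <= b);
    diff[a] += k;      /* slope rises by k where the range starts */
    diff[b + 1] -= k;  /* and falls by k just past where it ends */
}

/* Accumulate the slopes left to right, writing the actual values into
   values[1..n], and return the largest one. */
long accumulate_max(const long *diff, long *values, int n) {
    long running = 0;
    long max = 0;
    for (int i = 1; i <= n; i++) {
        running += diff[i];
        values[i] = running;
        if (running > max) {
            max = running;
        }
    }
    return max;
}
```

For the three updates in the Explanation section (1 2 100, 2 5 100, 3 4 100) on an array of size 5, positions 1 to 5 of the difference array end up as 100 100 0 0 -100, and the running sums are 100 200 200 200 100, so the maximum is 200.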

Code

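#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char* readline();
char* ltrim(char*);
char* rtrim(char*);
char** split_string(char*);
int parse_int(char*);

long arrayManipulation(int n, int queries_rows, int queries_columns, int** queries) {
    (void)queries_columns; /* always 3: a, b, k */

    /* Difference array, zero-initialised. It has n + 2 entries so that
       index to + 1 is valid even when to == n. */
    long *mat = calloc((size_t)n + 2, sizeof *mat);
    if (!mat) {
        exit(EXIT_FAILURE);
    }

    for (int i = 0; i < queries_rows; i++) {
        int from = queries[i][0];
        int to = queries[i][1];
        int val = queries[i][2];

        mat[from] += val;    /* slope rises by val at the start of the range */
        mat[to + 1] -= val;  /* and falls by val just past the end */
    }

    /* The running sum of the slopes is the actual value at each index. */
    long max = 0, running = 0;
    for (int i = 1; i <= n; i++) {
        running += mat[i];
        if (running > max) {
            max = running;
        }
    }

    free(mat);
    return max;
}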

int main()
{
    FILE* fptr = fopen(getenv("OUTPUT_PATH"), "w");

    char** first_multiple_input = split_string(rtrim(readline()));

    int n = parse_int(*(first_multiple_input + 0));

    int m = parse_int(*(first_multiple_input + 1));

    int** queries = malloc(m * sizeof(int*));

    for (int i = 0; i < m; i++) {
        *(queries + i) = malloc(3 * (sizeof(int)));

        char** queries_item_temp = split_string(rtrim(readline()));

        for (int j = 0; j < 3; j++) {
            int queries_item = parse_int(*(queries_item_temp + j));

            *(*(queries + i) + j) = queries_item;
        }
    }

    long result = arrayManipulation(n, m, 3, queries);

    fprintf(fptr, "%ld\n", result);

    fclose(fptr);

    return 0;
}

char* readline() {
    size_t alloc_length = 1024;
    size_t data_length = 0;

    char* data = malloc(alloc_length);

    while (true) {
        char* cursor = data + data_length;
        char* line = fgets(cursor, alloc_length - data_length, stdin);

        if (!line) {
            break;
        }

        data_length += strlen(cursor);

        if (data_length < alloc_length - 1 || data[data_length - 1] == '\n') {
            break;
        }

        alloc_length <<= 1;

        data = realloc(data, alloc_length);

        if (!data) {
            return NULL;
        }
    }

    if (data_length > 0 && data[data_length - 1] == '\n') {
        data[data_length - 1] = '\0';

        data = realloc(data, data_length);

        if (!data) {
            data = '\0';
        }
    } else {
        data = realloc(data, data_length + 1);

        if (!data) {
            data = '\0';
        } else {
            data[data_length] = '\0';
        }
    }

    return data;
}

char* ltrim(char* str) {
    if (!str) {
        return '\0';
    }

    if (!*str) {
        return str;
    }

    while (*str != '\0' && isspace(*str)) {
        str++;
    }

    return str;
}

char* rtrim(char* str) {
    if (!str) {
        return '\0';
    }

    if (!*str) {
        return str;
    }

    char* end = str + strlen(str) - 1;

    while (end >= str && isspace(*end)) {
        end--;
    }

    *(end + 1) = '\0';

    return str;
}

char** split_string(char* str) {
    char** splits = NULL;
    char* token = strtok(str, " ");

    int spaces = 0;

    while (token) {
        splits = realloc(splits, sizeof(char*) * ++spaces);

        if (!splits) {
            return splits;
        }

        splits[spaces - 1] = token;

        token = strtok(NULL, " ");
    }

    return splits;
}

int parse_int(char* str) {
    char* endptr;
    int value = strtol(str, &endptr, 10);

    if (endptr == str || *endptr != '\0') {
        exit(EXIT_FAILURE);
    }

    return value;
}

Categories: Programming

Logistic Regression Code – Telecom Churn Example

Jeswin Augustine March 22, 2019

Let's explore logistic regression code written in Python today. We have a sample dataset from a telecom provider, with data on customers who may or may not churn.

We have to make a prediction on the data set as accurately as possible.

Let's see how we can do that!

Categories: Machine Learning, Programming, Python Tags: logistic regression, machine learning, python code logistic regression

Multi-Linear Regression code – USA housing data set

Jeswin Augustine March 7, 2019

Yet another linear regression example, this time on a US housing dataset.

The dataset was taken from: https://www.kaggle.com/huzaifsayyed/us-housing-data

Categories: Programming

Multiple Linear Regression – Python code on housing case study

Jeswin Augustine March 2, 2019

Here is another easy-to-follow example of multiple linear regression code on housing data!

Categories: Machine Learning, Programming, Python Tags: linear regression, machine learning, python code linear regression

Simple Linear Regression – Python code

Jeswin Augustine February 28, 2019

Here is sample code for simple linear regression that you can easily follow!

GitHub link: https://github.com/jeswinaugustine/machine_learning_code/blob/master/Linear%20regression/Simple_Linear_Regression.ipynb

Categories: Machine Learning, Programming, Python Tags: linear regression, machine learning, python code linear regression

Python Basics – Strings !

Jeswin Augustine February 21, 2019

I have compiled a basic Jupyter notebook with a short introduction to Python and its string operations. This is only for quick reference!

Categories: Programming, Python Tags: introduction to python, python, python strings

Linear Regression Interview Questions – Part 2

Jeswin Augustine January 2, 2019

In the previous post, you saw some common interview questions asked on linear regression. The questions in that segment were mostly related to the essence of linear regression and focused on general concepts related to linear regression. This section extensively covers the common interview questions asked related to the concepts learnt in multiple linear regression.

Q1. What is Multicollinearity? How does it affect the linear regression? How can you deal with it?

Multicollinearity occurs when some of the independent variables are highly correlated (positively or negatively) with each other. This multicollinearity causes a problem as it is against the basic assumption of linear regression. The presence of multicollinearity does not affect the predictive capability of the model. So, if you just want predictions, the presence of multicollinearity does not affect your output. However, if you want to draw some insights from the model and apply them in, let’s say, some business model, it may cause problems.
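
One common way to quantify this, not covered in the answer above, is the variance inflation factor (VIF). The sketch below is a simplified illustration for the special case of exactly two predictors, where the VIF reduces to 1 / (1 - r^2), with r the Pearson correlation between them. The function names `r_squared` and `vif_two_predictors` are hypothetical, and a model with more predictors would need a full regression of each predictor on the others instead.

```c
#include <assert.h>
#include <stddef.h>

/* Squared Pearson correlation between two predictors x1 and x2. */
double r_squared(const double *x1, const double *x2, size_t n) {
    assert(n > 1);
    double m1 = 0.0, m2 = 0.0;
    for (size_t i = 0; i < n; i++) {
        m1 += x1[i];
        m2 += x2[i];
    }
    m1 /= (double)n;
    m2 /= (double)n;

    double s12 = 0.0, s11 = 0.0, s22 = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d1 = x1[i] - m1;
        double d2 = x2[i] - m2;
        s12 += d1 * d2;
        s11 += d1 * d1;
        s22 += d2 * d2;
    }
    assert(s11 > 0.0 && s22 > 0.0);  /* neither predictor may be constant */
    return (s12 * s12) / (s11 * s22);
}

/* Variance inflation factor when the model has exactly two predictors.
   Grows without bound as the predictors approach perfect correlation. */
double vif_two_predictors(const double *x1, const double *x2, size_t n) {
    return 1.0 / (1.0 - r_squared(x1, x2, n));
}
```

For x1 = {1, 2, 3, 4, 5} and x2 = {2, 1, 4, 3, 5}, r^2 is 0.64 and the VIF is about 2.78. The closer the two predictors move together, the larger the VIF becomes.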

Linear Regression Interview Questions – Part 1

Jeswin Augustine December 10, 2018

It is common practice to test data science aspirants on linear regression, as it is the first algorithm that almost everyone studies in Data Science/Machine Learning. Aspirants are expected to possess an in-depth knowledge of this algorithm. We consulted hiring managers and data scientists from various organisations to learn about the typical linear regression questions they ask in an interview. Based on their extensive feedback, a set of questions and answers was prepared to help students in their conversations.

Q1. What is linear regression?

In simple terms, linear regression is a method of finding the best straight line fitting to the given data, i.e. finding the best linear relationship between the independent and dependent variables.
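
To make "finding the best straight line" concrete, here is a minimal C sketch of an ordinary least squares fit with a single independent variable. The `Line` type and the `fit_line` function are hypothetical names used only for this illustration. The slope is the sum of the products of the deviations of x and y from their means, divided by the sum of the squared deviations of x, and the intercept makes the line pass through the point of means.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double slope;
    double intercept;
} Line;

/* Ordinary least squares fit of y = intercept + slope * x. */
Line fit_line(const double *x, const double *y, size_t n) {
    assert(n > 1);
    double mx = 0.0, my = 0.0;
    for (size_t i = 0; i < n; i++) {
        mx += x[i];
        my += y[i];
    }
    mx /= (double)n;
    my /= (double)n;

    double sxy = 0.0, sxx = 0.0;
    for (size_t i = 0; i < n; i++) {
        sxy += (x[i] - mx) * (y[i] - my);
        sxx += (x[i] - mx) * (x[i] - mx);
    }
    assert(sxx > 0.0);  /* x must not be constant */

    Line line;
    line.slope = sxy / sxx;
    line.intercept = my - line.slope * mx;
    return line;
}
```

For x = {1, 2, 3, 4} and y = {3, 5, 7, 9}, the points lie exactly on a line, and the fit recovers a slope of 2 and an intercept of 1.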