Crude Blending — Gaining a Competitive Edge in the Oil & Gas Industry through Linear Programming

Published in

Analytics Vidhya

6 min readDec 30, 2020

Image Courtesy: https://finfeed.com/features/oil-and-gas-juniors-looking-goods/

The Oil & Gas sector is globally regarded as the 3rd largest industry after Health Insurance and Pension Funds, generating a revenue close to USD $3.3 trillion over the last 5 years. It is widely considered as the lifeline to sustain the businesses of major industries such as Defense, Aviation and Transportation.

Crude oil, commonly known as “black gold” is a scarce resource which when carefully distilled in a refinery passing through series of chemical processes, synthesizes to produce useful products that run our daily lives such as Petrol (or Motor Spirit), Diesel, Kerosene, Paraffins, Jet Fuel, etc. It is due to this global scarcity of crude combined with several geographic, technical and socio-political challenges of setting up complexed processes in a refinery that makes oil companies around the globe to lock horns with each other in order to gain worldwide dominance.

Overview of Crude Blending

One of the ways for an already constrained organization, to gain a competitive edge over other candidates in the Oil & Gas industry is through ‘crude blending’ or simply stated, the process of combining different grades of crude oil in varying ratios to produce products for distribution by maximizing it’s profit margin and satisfying their customers’ daily demand without compromising the quality of product or causing any significant impact to the environment. Through this article, we will try to gain a basic step-by-step understanding of how blending operations in a refinery can be modelled as a linear programming problem in Python and solve it to obtain the optimal blending proportions.

Defining a problem for a fictitious oil refinery

Importing required libraries.

import pandas as pd
import numpy as np
import itertools
from tqdm import tqdm
from pulp import *
import matplotlib.pyplot as plt
from IPython.core.display import display, HTMLdef display_side_by_side(dfs:list, captions:list):
    """Display tables side by side to save vertical space
    Input:
        dfs: list of pandas.DataFrame
        captions: list of table captions
    """
    output = ""
    combined = dict(zip(captions, dfs))
    for caption, df in combined.items():
        output += df.style.set_table_attributes("style='display:inline'").set_caption(caption)._repr_html_()
        output += "\xa0\xa0\xa0"
    display(HTML(output))import warnings
warnings.filterwarnings("ignore")

All data and code used in this article can be accessed through my GitHub repository: https://github.com/mohiteprathamesh1996/Crude-Blending.git

Let’s read the files have a look at the data,

raw_crude_df = pd.read_excel("Crude Blending.xlsx", sheet_name="Raw Crude")
raw_crude_df.set_index(["Crude Type"], inplace=True)products_df = pd.read_excel("Crude Blending.xlsx", sheet_name="Products")
products_df.set_index(["Product Name"], inplace=True)# Display data
display_side_by_side(dfs=[raw_crude_df, products_df], 
                     captions=["Raw Crude", "Blended Products from Raw Crude"])

We can see we that the oil refinery produces 3 types of petroleum products namely Fuel_1, Fuel_2 and Fuel_3 by blending 5 different types of crude oil i.e. West Texas Intermediate, Brent, Saharan Blend and BCF-17. Through a linear programming approach, we have to build a model that maximize the net operational revenue of the refinery’s operations.

Decision Variables

Let us define the decision variable

X(c, p) = Number of barrels of crude type ‘c’ required to make petroleum product ‘p’ for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

# List of crude types
crude_types = raw_crude_df.index.to_list()# List of petroleum products
products = products_df.index.to_list()# Dictionary of decision variables
var_dict = LpVariable.dicts("Blends", 
[(crude, product) 
for crude in crude_types 
        for product in products], 
                            lowBound=0, 
                            cat="Integer")# Display decision variables
print(var_dict)

Defining the Objective Function

Framing the above LP as a maximization problem, we can define out objective function as :

Maximize, Z = Selling Price — (Purchasing Cost + Conversion cost)

Where,

Purchasing Cost for each type of crude oil = Σ Cost(c)*Σ(X(c, p)) for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

Assuming a $4 cost of converting each barrel of crude oil to finished product,

Conversion cost = Σ(X(c, p)) for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

Selling Price for each finished petroleum product = Σ Cost(p)*Σ(X(c, p)) for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

# Model initialization
model = LpProblem("Optimized Blending of Crude Oil Types", LpMaximize)# Objective Function
purchasing_cost = lpSum(
[raw_crude_df.loc[c, "Purchasing Price Per Barrel"] * var_dict[(c,p)]for c in crude_types for p in products])conversion_cost = lpSum(
[4 * var_dict[(c, p)] for c in crude_types for p in products])selling_price = lpSum(
[products_df.loc[p, "Selling Price per Barrel"] * var_dict[(c, p)] \
        for p in products for c in crude_types])
model += selling_price - (purchasing_cost + conversion_cost)

Setting Constraints

Let’s say that the above refinery has budget to purchase at most 7,500 barrels per day of each type of crude oil, i.e.

ΣX(c, p1)+ΣX(c, p2)+….+ΣX(c, pN) ≤ 7,500 for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

#1 Upto 7500 barrels of each crude type can be purchased per day
for c in crude_types:
    model += lpSum([var_dict[(c, p)] for p in products]) <= 7500

Next, we must ensure that each blended finished product has to satisfy the minimum Octane rating, i.e.

Σ(Octane Rating (c)*X(c, p)) ≥ Octane Rating (p)*ΣX(c,p) for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

#2 Fuel quality based on Octane number
for p in products:
    model += lpDot(
[var_dict[(c,p)] for c in crude_types],
[raw_crude_df.loc[c, "Octane Number"] for c in crude_types])\
               >= products_df.loc[p, "Octane Number"] * lpSum([var_dict[(c, p)] for c in crude_types])

Further, the refinery is constrained by its labor laws and other internal logistical hassles to produce at most 50,000 barrels per day of finished products, i.e.

ΣX(c,p) ≤ 50,000 for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

#3 Maximum production capacity of 50000 barrels per day
model += lpSum(
[var_dict[(c, p)] for c in crude_types for p in products]) <= 50000

The refinery must also satisfy the daily customer demand of specific number of barrels per day, i.e.

ΣX(c1, p)+ΣX(c2, p)+….+ΣX(cN, p) = Demand(p) for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

#4 Fulfill daily customer demand 
for p in products:
    model += lpSum([var_dict[(c,p)] for c in crude_types]) == products_df.loc[p, "Demand"]

Lastly, the finished product must adhere to certain environmental norms so as to limit it’s sulfur concentration, i.e.

Σ(Sulfur Content(c)*X(c, p)) ≤ Sulfur Content(p)*ΣX(c,p) for all c∈(West Texas Intermediate, Brent, Saharan Blend and BCF-17) and p∈(Fuel_1, Fuel_2 and Fuel_3)

#5 Sulfur content limitations
for p in products:
    model += lpDot(
[var_dict[(c,p)] for c in crude_types],
[raw_crude_df.loc[c, "Sulfur content (%)"] for c in crude_types])\
               <= products_df.loc[p, "Allowable Sulfur (%)"] * lpSum([var_dict[(c, p)] for c in crude_types])

Optimal Solution

Last step, we solve the above LP problem and obtain the optimal blending proportions for each type of crude oil to produce finished products.

# Solve the mode
model.solve()# Saving the optimal solution results
if LpStatus[model.status]=="Optimal":
    optimal_soln = pd.DataFrame(
[(v.name, 
int(v.varValue)) for v in model.variables() if v.varValue!=0],
                        columns=["Decisions", "Number of Barrels"])

The optimal solution obtained is as follows.

print("Total Profit : USD $ {}".format(round(value(model.objective), 2)))display_side_by_side(
    dfs=[round(100 * np.divide(optimal_soln[optimal_soln["Decisions"].str.contains(p)]\
                   .set_index("Decisions"),
                   optimal_soln[optimal_soln["Decisions"].str.contains(p)]\
                   .set_index("Decisions").sum()), 2).rename(columns={"Number of Barrels":"Blend_Perc (%)"})\
         for p in products], 
    captions=["Optimal Blending Proportions for "+i for i in products])

References

I really hope you have enjoyed reading this article. Feel free to drop in your comments or suggestions. Thanks.

Let’s connect on LinkedIn: https://www.linkedin.com/in/prathameshmohite96/