#Help Page

#Background

Thermodynamic analysis tools such as the Nucleic Acid Package (NUPACK) [1] can calculate the equilibrium concentrations of DNA complexes given the concentrations of the constituent DNA strands. NUPACK first determines the free energy of formation of each complex from the strands’ sequences, and then determines the equilibrium concentration of each complex from its free energy. However, calculating the complex free energies is generally substantially slower than computing the equilibrium concentrations, which is a well-studied form of convex optimization with fast convergence. Thus, significant speedups can be achieved if the free energies of the relevant complexes are known beforehand, for example from empirical measurements or from approximations such as domain-level DNA hybridization assumptions.

#Summary

Concentrat.io computes free energies and concentrations of domain-level DNA designs. At this high level of abstraction, our tool allows arbitrary connectivity between the bound strands, including pseudoknots, albeit without penalizing geometrically infeasible configurations. The model is shared by thermodynamic binding networks [2].

#Input Syntax

#Specification of monomers and their concentrations

Each monomer (strand) is specified on a separate line in the Monomers and Concentrations field. The binding sites (domains) are space-separated. The list of binding sites is followed by a comma, then the monomer concentration.

a b, 10
a* b*, 70.9    # This is a comment
a, 50
b, 50

The concentration units are specified by the Concentration Units dropdown menu (mM, µM, nM, or pM).

Monomers (strands) can be optionally named by specifying a label followed by a colon.

strand1: a b, 100
a* b*, 50
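
To make the input syntax concrete, here is a minimal Python sketch of how such lines could be parsed. It is not the parser used by concentrat.io; the function name `parse_monomers` and the returned dictionary layout are hypothetical, and we assume that everything after a `#` is a comment.

```python
def parse_monomers(text):
    """Parse 'Monomers and Concentrations' text into a list of dicts.

    Hypothetical helper: illustrates the syntax, not the site's own parser.
    """
    monomers = []
    for raw_line in text.splitlines():
        # Assumption: '#' starts a comment that runs to the end of the line.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # The concentration follows the last comma on the line.
        sites_part, sep, conc_part = line.rpartition(",")
        if not sep:
            raise ValueError(f"missing concentration: {raw_line!r}")
        # An optional label precedes a colon.
        name = None
        if ":" in sites_part:
            name, sites_part = sites_part.split(":", 1)
            name = name.strip()
        monomers.append({
            "name": name,
            "sites": sites_part.split(),  # binding sites are space-separated
            "concentration": float(conc_part),
        })
    return monomers


example = """\
strand1: a b, 100
a* b*, 70.9    # This is a comment
a, 50
"""
parsed = parse_monomers(example)
for monomer in parsed:
    print(monomer)
```

The concentration is kept as a plain number here; its unit is whatever the Concentration Units dropdown specifies.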

#Binding site free energies

The field Default Binding Energy specifies the default binding free energy (kcal/mol) of all binding sites (domains). To specify different energies for different binding sites, you can use the optional text field Site Binding Energies as follows.

a = -10.5
b = -20

Note that all binding sites not listed in the Site Binding Energies field receive the default energy from Default Binding Energy (default -20.6 kcal/mol).

Note that the binding site free energies (whether default or explicitly specified) do not get adjusted with varying Temperature.
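
The fallback to the default energy can be sketched in Python as follows. The helper names `parse_site_energies` and `site_energy` are hypothetical, and we assume that a site and its complement (for example a and a*) share one energy entry; this is an illustration rather than the tool's actual implementation.

```python
DEFAULT_BINDING_ENERGY = -20.6  # kcal/mol, the default stated in the text


def parse_site_energies(text):
    """Parse lines of the form 'site = energy' into a dict (hypothetical helper)."""
    energies = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        site, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'site = energy': {raw_line!r}")
        energies[site.strip()] = float(value)
    return energies


def site_energy(site, overrides, default=DEFAULT_BINDING_ENERGY):
    """Return the binding energy of a site, falling back to the default.

    Assumption: 'a' and 'a*' are looked up under the same key 'a'.
    """
    return overrides.get(site.rstrip("*"), default)


overrides = parse_site_energies("a = -10.5\nb = -20")
print(site_energy("a", overrides))   # explicitly listed
print(site_energy("b*", overrides))  # complement of a listed site
print(site_energy("c", overrides))   # not listed, so the default applies
```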

#Complex Enumeration

We say a complex is splittable if it can be partitioned into two complexes while maintaining the same bonds. For example, consider the complex consisting of the following monomers: a b, a* b*, and a. This complex can be split into two complexes [a b, a* b*] and [a] while maintaining one a-a* bond and one b-b* bond. Thus the original complex is splittable.
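
The definition can be checked by brute force for small complexes. The sketch below is not the enumeration algorithm used by concentrat.io. It assumes that a complex forms as many bonds as possible, so that its bond count is the sum over sites x of min(#x, #x*), and it treats a complex as splittable if some partition into two nonempty parts preserves that total. The function names are hypothetical, and the search is exponential in the number of monomers.

```python
from collections import Counter
from itertools import combinations


def max_bonds(monomers):
    """Number of bonds, assuming each site pairs with a complement when one exists."""
    counts = Counter(site for monomer in monomers for site in monomer)
    bases = {site.rstrip("*") for site in counts}
    return sum(min(counts[base], counts[base + "*"]) for base in bases)


def is_splittable(monomers):
    """True if the monomers can be split into two parts with no bonds lost."""
    total = max_bonds(monomers)
    indices = range(len(monomers))
    for size in range(1, len(monomers)):
        for chosen in combinations(indices, size):
            part = [monomers[i] for i in chosen]
            rest = [monomers[i] for i in indices if i not in chosen]
            if max_bonds(part) + max_bonds(rest) == total:
                return True
    return False


# The example from the text: a b, a* b*, and a.
example_complex = [["a", "b"], ["a*", "b*"], ["a"]]
print(is_splittable(example_complex))              # True
print(is_splittable([["a", "b"], ["a*", "b*"]]))   # False
```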

If Maximum Complex Size is set to infinity (the default setting), then concentrat.io will enumerate all unsplittable complexes. Note that there is a mathematical correspondence between the unsplittable complexes and the Hilbert basis of the corresponding linear problem [4]. It follows from this correspondence that the number of unsplittable complexes is always finite.

While the number of unsplittable complexes is finite, it may be very large. Setting Maximum Complex Size to a positive integer restricts the maximum size (number of monomers) of the unsplittable complexes enumerated.

#Free Energy Calculation

For any complex, its free energy is computed as follows:

$\Delta G = \Delta H - T \cdot k_B \cdot \ln(R) + \Delta G^\text{assoc} \cdot (L - 1)$

where $T$ is the temperature (K) and $k_B$ is Boltzmann's constant in molar units (0.001987204259 kcal/mol/K).

$\Delta H$ is the total binding energy of the complex (kcal/mol). In other words, it is the sum of the binding energies of all the bound domains. Note that the domain binding energy is directly taken from Default Binding Energy or from its specification in Site Binding Energies without any temperature adjustment (i.e., we assume that these values are already temperature-adjusted when specified).

Complexes of more than one strand are penalized by $\Delta G^\text{assoc}$ (ΔG of association) for each additional strand. Here $L$ is the number of strands in the complex. $\Delta G^\text{assoc}$ is computed in exactly the same way as in NUPACK 3 for DNA parameters (including temperature adjustment), except that no salt correction is applied.

$R$ is the number of microstates corresponding to different equivalent permutations of strands in the complex. Specifically, if this complex has count $n_i$ of strand $i$, then $R = \prod_i n_i!$.

As a concrete example, consider the complex consisting of two monomers x x and one monomer x* x* x* x*. Let us assume Default Binding Energy of -20.6 kcal/mol. Then for this complex, $\Delta H = -20.6 \cdot 4$ kcal/mol, $L = 3$, and $R = 2$.

Note that while we correct for multiple copies of the same strand in a complex, we do not consider multiple possible ways of making bonds within a complex. This choice is due to a priori not knowing which of these ways is geometrically feasible.
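
The formula and the worked example above can be reproduced with a short Python sketch. The function names are hypothetical. The text does not give a numerical value for $\Delta G^\text{assoc}$, so the value used here is an assumption: it is inferred from the sample API response on this page (free energy -40.486 kcal/mol for a two-strand complex with $\Delta H = -40$ kcal/mol at 25 °C) and should be treated as illustrative only. The temperature of 298.15 K is likewise an assumption.

```python
import math

K_B = 0.001987204259  # kcal/mol/K, value given in the text


def permutation_count(strand_counts):
    """R = product of n_i! over the distinct strands in the complex."""
    r = 1
    for n in strand_counts:
        r *= math.factorial(n)
    return r


def complex_free_energy(delta_h, strand_counts, temperature_k, dg_assoc):
    """Delta G = Delta H - T * k_B * ln(R) + dg_assoc * (L - 1)."""
    r = permutation_count(strand_counts)
    num_strands = sum(strand_counts)  # L in the formula
    return (delta_h
            - temperature_k * K_B * math.log(r)
            + dg_assoc * (num_strands - 1))


# Worked example: two copies of "x x" and one copy of "x* x* x* x*".
delta_h = -20.6 * 4        # four bound sites at the default energy
strand_counts = [2, 1]     # n_i for each distinct strand
# Assumption: inferred from the sample API response at 25 C, not an official value.
dg_assoc_assumed = -0.486
print("R =", permutation_count(strand_counts))
print("L =", sum(strand_counts))
print("Delta G =", complex_free_energy(delta_h, strand_counts, 298.15, dg_assoc_assumed))
```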

#Tutorial

The following system appears in [2] (Figure 3.3). Paste the following into Monomers and Concentrations:

# This system acts like an AND gate with
# the following two strands as input
input1: a1 a2, 100
input2: b1 b2, 100
a1* a2* b1* b2*, 100
a1 a2 b1 b2 c1, 100
a2* b1* b2* c1*, 100
a2 b1, 100
b2 c1 c2, 100
c1* c2*, 100
output: c1 c2, 100   # This strand is the output

Leave the other fields at their default values.

Clicking on Compute Energies shows the histogram and table of the free energies of the 54 enumerated unsplittable complexes. Clicking on Calculate Concentrations returns the concentrations of these complexes in the table.

The search box above the table of concentrations filters the table to show only the complexes containing the monomers listed in the search box (comma-separated). Try typing output in the search box. We can see that about 36 nM of the output monomer is free, while about 64 nM is bound in the complex with c1* c2*.

Now try commenting out the lines with the input monomers:

# input1: a1 a2, 100
# input2: b1 b2, 100

Recomputing the concentrations shows that now almost all of the output monomer is together with c1* c2*.

#API Documentation

Concentrat.io functionality is also exposed via a Web API. The following example demonstrates how to use Python code to compute concentrations:

import requests

# Define the API endpoint
url = 'https://concentrat.io/api/calculate/concentrations'

# Set the header declaring that the request body is JSON
headers = {
    'Content-Type': 'application/json'
}

# Define the data to be sent in the POST request
data = {
    "max_complexes": 3,
    "temperature": 25,
    "max_complex_energy": 0,
    "concentration_unit": "nM",
    "binding_energy": -20,
    "energies_inputs": "",
    "monomers": [
        ["a b", 10], 
        ["a* b*", 10], 
        ["a", 10], 
        ["b", 10]
    ]
}

# Make the POST request
response = requests.post(url, headers=headers, json=data).json()

# Print the response from the server (json)
print(response)

The response contains the input along with the computed energies and concentrations of the complexes. For example, response["complexes"][0] returns the first complex (the one with the highest concentration), which in this case is:

{
    'monomers': [[0, 1], [1, 1]], 
    'free_energy': -40.48608306337398, 
    'concentration': 9.999801e-09
}

The monomers field represents the complex as a list of [monomer index, count] pairs, where the index refers to the position in the monomers list of the request. Thus this complex has one monomer "a b" and one monomer "a* b*".
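
As an illustration of how the monomers field can be decoded, the following sketch maps each index back to the corresponding entry of the request. The helper name `describe_complex` is hypothetical and is not part of the API; the data is copied from the request and response shown above.

```python
def describe_complex(complex_entry, monomers):
    """Render a complex as text using the monomer list sent in the request.

    Hypothetical helper, not part of the concentrat.io API.
    """
    parts = []
    for index, count in complex_entry["monomers"]:
        sites = monomers[index][0]  # each request entry is [sites, concentration]
        parts.append(f"{count} x [{sites}]")
    return " + ".join(parts)


# The monomers list from the request and the first complex from the response.
monomers = [["a b", 10], ["a* b*", 10], ["a", 10], ["b", 10]]
top_complex = {
    "monomers": [[0, 1], [1, 1]],
    "free_energy": -40.48608306337398,
    "concentration": 9.999801e-09,
}
print(describe_complex(top_complex, monomers))
# prints: 1 x [a b] + 1 x [a* b*]
```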

#References

[1] J. N. Zadeh, C. D. Steenberg, J. S. Bois, B. R. Wolfe, M. B. Pierce, A. R. Khan, R. M. Dirks, N. A. Pierce. NUPACK: analysis and design of nucleic acid systems. J Comput Chem, 32:170–173, 2011.

[2] K. Breik, C. Thachuk, M. Heule, D. Soloveichik. Computing properties of stable configurations of thermodynamic binding networks. Theoretical Computer Science, 785:17–29, 2019.

[3] J. Petrack, D. Soloveichik, D. Doty. Thermodynamically Driven Signal Amplification. DNA Computing and Molecular Programming 29 (DNA29), 2023.

[4] D. Haley, D. Doty. Computing properties of thermodynamic binding networks: An integer programming approach. arXiv preprint arXiv:2011.10677, 2020.