Data Transformations

1. Introduction to Standalone Data Transformations in FCS Express

Data Transformations let you create custom parameters using k-means clustering, principal component analysis, Parameter Math, SPADE, and t-SNE/viSNE, and let you integrate R scripts directly. The new transformation tools are built directly into FCS Express, with an easy-to-use interface and the ability to drag and drop results onto plots.

2. Introduction to Parameter Math

Parameter Math lets you create new parameters, or modify existing ones, from mathematical combinations (formula sequences) of the parameters already present in a data file.
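As an illustration of the idea, the sketch below derives a new parameter as the ratio of two existing ones. It is plain Python, not FCS Express syntax; the function name `add_ratio_parameter` and the event layout (one dictionary per event) are assumptions made for this example only.

```python
import math

def add_ratio_parameter(events, numerator, denominator, new_name):
    """Return copies of the events with an extra parameter: numerator / denominator."""
    result = []
    for event in events:
        derived = dict(event)  # copy, so the original event is left unchanged
        denom = event[denominator]
        # Division by zero yields NaN instead of raising an error.
        derived[new_name] = event[numerator] / denom if denom != 0 else math.nan
        result.append(derived)
    return result

# Hypothetical two-event data set with two existing parameters.
events = [
    {"FSC-A": 100.0, "FSC-H": 50.0},
    {"FSC-A": 90.0, "FSC-H": 0.0},
]
with_ratio = add_ratio_parameter(events, "FSC-A", "FSC-H", "FSC-A/FSC-H")
```

The derived value is stored alongside the existing parameters, so it can be used like any other parameter in later steps.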

3. Introduction to Principal Component Analysis (PCA)

Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations, in our case cytometry-based events, into new variables called principal components. The transformation is defined so that the first principal component accounts for the largest possible variance in the data, and each succeeding component accounts for the largest remaining variance while being orthogonal (at 90 degrees) to all preceding components. Principal components defined in FCS Express are available as new parameters on plots.
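To make the definition concrete, here is a minimal pure-Python sketch that finds the first principal component of a small data set by power iteration on the covariance matrix, then projects the events onto it. It illustrates the general technique only and is not the FCS Express implementation; all function names and the sample data are hypothetical.

```python
import math

def center(rows):
    """Subtract the per-parameter mean from every event."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered events."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
            for i in range(dims)]

def first_component(cov, iterations=200):
    """Approximate the direction of maximum variance by power iteration."""
    dims = len(cov)
    v = [1.0] * dims  # starting guess; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(centered, direction):
    """Score of each event along the given direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in centered]

# Hypothetical events with two parameters lying roughly along y = 2x.
events = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.9]]
centered = center(events)
cov = covariance(centered)
pc1 = first_component(cov)
scores = project(centered, pc1)  # the new "PC 1" parameter, one value per event
```

In two dimensions the second component is simply the direction perpendicular to `pc1`; with more parameters, further components are usually obtained from a full eigendecomposition rather than from power iteration alone.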

4. Introduction to K-Means Cluster Analysis

k-means is a partitioning-based clustering algorithm. It is an iterative process in which an initial partition of the data into k clusters is progressively improved by a search algorithm.

In simplified terms, given a pre-defined number of clusters (k), the algorithm:

begins with an initial set of k cluster centers (i.e. the centroids),
assigns or re-assigns objects to the closest centroids,
recalculates centroids according to new memberships of the data points,
repeats the last two steps until the cluster assignments stabilize or until the maximum number of iterations is reached.
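The steps above can be sketched in a few lines of plain Python. This is a minimal illustration of the basic mechanism, not the FCS Express implementation; the function name `kmeans`, its parameters, and the sample points are assumptions for this example. The initial centroids are passed in explicitly so the result is reproducible.

```python
import math

def kmeans(points, initial_centroids, max_iterations=100):
    """Basic k-means: returns (centroids, assignments)."""
    centroids = [list(c) for c in initial_centroids]
    k = len(centroids)
    assignments = None
    for _ in range(max_iterations):
        # Assign (or re-assign) each point to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Recalculate each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

# Hypothetical events forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, assignments = kmeans(points, initial_centroids=[points[0], points[3]])
```

The outcome of k-means depends on the starting centroids, which is why implementations commonly choose them at random and may repeat the run several times.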

User Manual - K-Means (Cluster Analysis)

5. Introduction to R Integration

The goal of R integration in FCS Express is to let users run their own R scripts and work with the resulting output directly within FCS Express. FCS Express currently supports two types of R scripts: one that adds new transformed parameters (R Add Parameters) and one that adds cluster assignments (R Cluster Transformation).

6. Introduction to t-Distributed Stochastic Neighbor Embedding (t-SNE)

FCS Express integrates t-Distributed Stochastic Neighbor Embedding (t-SNE), a tool that maps high-dimensional cytometry data onto a two-dimensional plot while preserving as much of the original neighborhood structure as possible, helping you visualize and analyze high-dimensional data.

The final result of the algorithm in FCS Express is a 2D plot in which the positions of cells reflect their proximity in the original high-dimensional space. Plots can further be colored by density or by a heat map of each parameter, allowing for easy visualization of populations.

7. Introduction to SPADE

High-dimensional single-cell technologies, such as flow, mass, and image cytometry, can measure dozens of parameters at the single-cell level. FCS Express integrates Spanning-tree Progression Analysis of Density-normalized Events (SPADE), a tool that extracts a hierarchy from high-dimensional cytometry data in an unsupervised manner and lets users visualize multiple cell types in a branched tree structure without having to define a known cellular ordering.

The final result of the algorithm in FCS Express is a Heat Map plot in which each cell type is depicted as a node of the branched tree. The Heat Map can be formatted to color each node based on the expression of a given marker. The size of each node can be made proportional to a given statistic (e.g., the number of events within the node).

8. Introduction to Pipelines

Pipelines are a set of data processing steps that stand alone or are connected in series. The output of a step can be applied to a data file or utilized as the input of the next step, or series of steps. Go from raw data to results step-by-step. Pipelines provide the control and flexibility you need for data processing.
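Conceptually, a series of connected steps behaves like function composition: each step receives the output of the previous one. The sketch below shows that idea in plain Python. It is not how pipelines are defined in FCS Express, and the names `run_pipeline`, `remove_negative`, and `log_scale` are hypothetical.

```python
import math

def run_pipeline(events, steps):
    """Apply each step to the output of the previous one; keep every intermediate output."""
    outputs = []
    current = events
    for step in steps:
        current = step(current)
        outputs.append(current)
    return outputs

def remove_negative(values):
    """Example step: drop events with negative values."""
    return [v for v in values if v >= 0]

def log_scale(values):
    """Example step: log10(v + 1) transform."""
    return [math.log10(v + 1) for v in values]

# Hypothetical single-parameter data.
raw = [-5.0, 0.0, 9.0, 99.0]
outputs = run_pipeline(raw, [remove_negative, log_scale])
cleaned, transformed = outputs
```

Keeping the intermediate outputs mirrors the point made above: the result of any step can be used on its own or passed on to the next step.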

Pipelines enable users to utilize the flexible and intuitive interface of FCS Express to perform advanced data analysis and processing steps, without the need for external applications such as R or Python. Users no longer have to write complex programming scripts or rely on plugins. With FCS Express you have access to many of the most common and cutting-edge data analysis tools directly in the software.

9. Introduction to FlowSOM

FlowSOM is a clustering and visualization tool that facilitates the analysis of high-dimensional data.

Clusters are arranged via a Self-Organizing Map (SOM) in a Minimum Spanning Tree, which can help FCS Express users understand how the markers in their panel behave across all cells.

A second clustering step (i.e., meta-clustering) is performed, which can provide a basis for discerning biological similarity and can help detect groups that may have otherwise been missed.

10. Introduction to FlowAI

FlowAI is an automatic and interactive tool to remove outliers and anomalies from your data files, essentially "cleaning" your data.

The algorithm removes events with anomalous values by taking into account three aspects of a flow cytometry data file:

Flow rate
Signal acquisition
Dynamic range

Removing these lower-quality events improves both basic and advanced downstream analysis and helps produce more reliable and reproducible results in FCS Express.
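To give a feel for what a flow-rate check involves, the sketch below bins events by acquisition time and discards bins whose event count deviates strongly from the median count. This is a deliberately simplified stand-in, not the FlowAI algorithm; the function names, the `tolerance` threshold, and the sample data are all assumptions made for this example.

```python
from statistics import median

def flag_unstable_bins(times, bin_width, tolerance=0.5):
    """Return the time-bin indices whose event count is far from the median count."""
    counts = {}
    for t in times:
        b = int(t // bin_width)
        counts[b] = counts.get(b, 0) + 1
    typical = median(counts.values())
    # Note: bins containing no events at all never appear in `counts`.
    return {b for b, n in counts.items() if abs(n - typical) > tolerance * typical}

def keep_stable_events(times, bin_width, tolerance=0.5):
    """Keep only events that fall in bins with a typical flow rate."""
    unstable = flag_unstable_bins(times, bin_width, tolerance)
    return [t for t in times if int(t // bin_width) not in unstable]

# Hypothetical acquisition: 10 events per second, except a surge of 30 in second 3.
times = [s + i / 10 for s in (0, 1, 2, 4) for i in range(10)]
times += [3 + i / 30 for i in range(30)]

unstable = flag_unstable_bins(times, bin_width=1.0)
kept = keep_stable_events(times, bin_width=1.0)
```

Here the surge in the fourth second is flagged and its events are dropped, while the steadily acquired events are kept.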

11. Introduction to FlowCut

FlowCut allows the user to perform quality control on flow cytometry data in order to improve both manual and automated downstream analysis.

FlowCut is a two-step process:

It first removes acquisition regions with fewer events, then
It cleans the remaining data based on multiple quality control tests.

User Manual - FlowCut Pre-defined Algorithm

12. Introduction to UMAP

UMAP, short for Uniform Manifold Approximation and Projection, is a dimensionality reduction technique that constructs a high-dimensional graph representation of the data and then optimizes a low-dimensional graph to be as structurally similar to it as possible.

This results in the creation of two new parameters, UMAP 1 and UMAP 2.

UMAP captures local relationships within a cluster as well as global relationships between distinct clusters.

13. Python Transformation

The Python Transformation pipeline step allows FCS Express and Python to communicate with each other through a text script that can be created directly within FCS Express.
