Read FCS files
In this notebook, we load an FCS file into the AnnData format, move the forward scatter (FSC) and side scatter (SSC) information to the .obs section of the AnnData object, and perform compensation on the data.
import readfcs
import pytometry as pm
Read the example data shipped with the readfcs package. The FCS file was part of the following reference and was originally deposited on FlowRepository.
path_data = readfcs.datasets.Oetjen18_t1()
adata = pm.io.read_fcs(path_data)
adata
AnnData object with n_obs × n_vars = 241552 × 20
var: 'n', 'channel', 'marker', '$PnR', '$PnB', '$PnE', '$PnV', '$PnG'
uns: 'meta'
The .var section of the AnnData object contains the channel information. By default, the marker names are used as var_names. In addition, the channel names are stored in the "channel" column.
adata.var
| | n | channel | marker | $PnR | $PnB | $PnE | $PnV | $PnG |
|---|---|---|---|---|---|---|---|---|
| FSC-A | 1 | FSC-A | | 262144 | 32 | 0,0 | 510 | 1.0 |
| FSC-H | 2 | FSC-H | | 262144 | 32 | 0,0 | 510 | 1.0 |
| FSC-W | 3 | FSC-W | | 262144 | 32 | 0,0 | 510 | 1.0 |
| SSC-A | 4 | SSC-A | | 262144 | 32 | 0,0 | 310 | 1.0 |
| SSC-H | 5 | SSC-H | | 262144 | 32 | 0,0 | 310 | 1.0 |
| SSC-W | 6 | SSC-W | | 262144 | 32 | 0,0 | 310 | 1.0 |
| CD95 | 7 | R660-A | CD95 | 262144 | 32 | 0,0 | 490 | 1.0 |
| CD8 | 8 | R780-A | CD8 | 262144 | 32 | 0,0 | 475 | 1.0 |
| CD27 | 9 | B515-A | CD27 | 262144 | 32 | 0,0 | 470 | 1.0 |
| CXCR4 | 10 | B710-A | CXCR4 | 262144 | 32 | 0,0 | 417 | 1.0 |
| CCR7 | 11 | V450-A | CCR7 | 262144 | 32 | 0,0 | 400 | 1.0 |
| LIVE/DEAD | 12 | V545-A | LIVE/DEAD | 262144 | 32 | 0,0 | 495 | 1.0 |
| CD4 | 13 | V605-A | CD4 | 262144 | 32 | 0,0 | 400 | 1.0 |
| CD45RA | 14 | V655-A | CD45RA | 262144 | 32 | 0,0 | 375 | 1.0 |
| CD3 | 15 | V800-A | CD3 | 262144 | 32 | 0,0 | 400 | 1.0 |
| CD49B | 16 | G560-A | CD49B | 262144 | 32 | 0,0 | 400 | 1.0 |
| CD14/19 | 17 | G610-A | CD14/19 | 262144 | 32 | 0,0 | 415 | 1.0 |
| CD69 | 18 | G660-A | CD69 | 262144 | 32 | 0,0 | 470 | 1.0 |
| CD103 | 19 | G780-A | CD103 | 262144 | 32 | 0,0 | 435 | 1.0 |
| Time | 20 | Time | | 262144 | 32 | 0,0 | | 0.01 |
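The introduction mentions moving the scatter information to .obs. As a rough illustration of how the scatter channels could be told apart from the fluorescence channels, the sketch below classifies the names in the "channel" column by their prefix. It is plain Python, not a pytometry function; the helper name is_scatter is hypothetical, and the prefix rule is an assumption that happens to fit this panel.

```python
# Channel names copied from the "channel" column of the table above.
channels = [
    "FSC-A", "FSC-H", "FSC-W", "SSC-A", "SSC-H", "SSC-W",
    "R660-A", "R780-A", "B515-A", "B710-A", "V450-A", "V545-A",
    "V605-A", "V655-A", "V800-A", "G560-A", "G610-A", "G660-A",
    "G780-A", "Time",
]


def is_scatter(channel: str) -> bool:
    """Return True for forward/side scatter channels (hypothetical helper)."""
    return channel.startswith(("FSC", "SSC"))


# Scatter channels describe cell size and granularity, not marker expression.
scatter = [c for c in channels if is_scatter(c)]

# Everything else, except the Time channel, is a fluorescence channel.
fluorescence = [c for c in channels if not is_scatter(c) and c != "Time"]

print(scatter)
print(len(fluorescence))
```

For this file the rule leaves 13 fluorescence channels, which matches the 13 × 13 size of the spillover matrix stored in the metadata.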
The .uns['meta'] section contains the header information from the FCS file.
adata.uns["meta"]
{'__header__': {'FCS format': 'FCS3.0',
'text start': 256,
'text end': 6333,
'data start': 6339,
'data end': 19330498,
'analysis start': 0,
'analysis end': 0},
'$BEGINANALYSIS': '0',
'$ENDANALYSIS': '0',
'$BEGINSTEXT': '0',
'$ENDSTEXT': '0',
'$BEGINDATA': '6339',
'$ENDDATA': '19330498 ',
'$FIL': '2-13-17 T cell Panel_T_E_G05_004.fcs',
'$SYS': 'Windows 7 6.1',
'$TOT': 241552,
'$PAR': 20,
'$MODE': 'L',
'$BYTEORD': '4,3,2,1',
'$DATATYPE': 'F',
'$NEXTDATA': 0,
'CREATOR': 'BD FACSDiva Software Version 8.0',
'TUBE NAME': 'T_E',
'$SRC': '2-13-17 T cell Panel',
'EXPERIMENT NAME': 'T_Memory_01-24-17',
'GUID': '641fcb4b-10df-4636-9325-31d9c563ae6b',
'$DATE': '10-SEP-2018',
'$BTIM': '16:02:38',
'$ETIM': '16:02:38',
'$CYT': 'LSRFortessa',
'SETTINGS': 'Cytometer',
'CYTNUM': 'H64717700086',
'WINDOW EXTENSION': '10.00',
'EXPORT USER NAME': 'Administrator',
'EXPORT TIME': '10-SEP-2018-16:02:38',
'$OP': 'Administrator',
'FSC ASF': '0.69',
'AUTOBS': 'TRUE',
'$INST': ' ',
'LASER1NAME': 'Blue',
'LASER1DELAY': '0.00',
'LASER1ASF': '0.78',
'LASER2NAME': 'Green',
'LASER2DELAY': '129.57',
'LASER2ASF': '0.75',
'LASER3NAME': 'Red',
'LASER3DELAY': '97.14',
'LASER3ASF': '0.57',
'LASER4NAME': 'UV',
'LASER4DELAY': '65.51',
'LASER4ASF': '0.77',
'LASER5NAME': 'Violet',
'LASER5DELAY': '34.39',
'LASER5ASF': '0.88',
'PLATE NAME': '2-13-17 T-memory',
'WELL ID': 'G05',
'PLATE ID': 'dac8255f-b7a7-4020-97d4-b2f6547e9b8b',
'$TIMESTEP': '0.01',
'APPLY COMPENSATION': 'TRUE',
'THRESHOLD': 'FSC,5000',
'P1DISPLAY': 'LIN',
'P1BS': '0',
'P1MS': '0',
'P2DISPLAY': 'LIN',
'P2BS': '0',
'P2MS': '0',
'P3BS': '-1',
'P3MS': '0',
'P4DISPLAY': 'LIN',
'P4BS': '0',
'P4MS': '0',
'P5DISPLAY': 'LIN',
'P5BS': '0',
'P5MS': '0',
'P6BS': '-1',
'P6MS': '0',
'P7DISPLAY': 'LOG',
'P7BS': '5464',
'P7MS': '0',
'P8DISPLAY': 'LOG',
'P8BS': '157',
'P8MS': '0',
'P9DISPLAY': 'LOG',
'P9BS': '102',
'P9MS': '0',
'P10DISPLAY': 'LOG',
'P10BS': '4284',
'P10MS': '0',
'P11DISPLAY': 'LOG',
'P11BS': '682',
'P11MS': '0',
'P12DISPLAY': 'LOG',
'P12BS': '177',
'P12MS': '0',
'P13DISPLAY': 'LOG',
'P13BS': '2348',
'P13MS': '0',
'P14DISPLAY': 'LOG',
'P14BS': '2322',
'P14MS': '0',
'P15DISPLAY': 'LOG',
'P15BS': '700',
'P15MS': '0',
'P16DISPLAY': 'LOG',
'P16BS': '679',
'P16MS': '0',
'P17DISPLAY': 'LOG',
'P17BS': '4480',
'P17MS': '0',
'P18DISPLAY': 'LOG',
'P18BS': '3799',
'P18MS': '0',
'P19DISPLAY': 'LOG',
'P19BS': '225',
'P19MS': '0',
'P20BS': '0',
'P20MS': '0',
'CST SETUP STATUS': 'SUCCESS',
'CST BEADS LOT ID': '74538',
'CYTOMETER CONFIG NAME': 'Copy of 5 Lasers UV SORP 2B 6V 2UV 3R 5Gr',
'CYTOMETER CONFIG CREATE DATE': '2014-01-29T14:36:56-08:00',
'CST SETUP DATE': '2016-12-21T08:52:55-08:00',
'CST BASELINE DATE': '2016-10-28T10:11:58-07:00',
'CST BEADS EXPIRED': 'False',
'CST PERFORMANCE EXPIRED': '2016-12-22T08:52:55-08:00',
'CST REGULATORY STATUS': 'RUO Performance Check',
'channels': $PnN $PnS $PnR $PnB $PnE $PnV $PnG
n
1 FSC-A 262144 32 0,0 510 1.0
2 FSC-H 262144 32 0,0 510 1.0
3 FSC-W 262144 32 0,0 510 1.0
4 SSC-A 262144 32 0,0 310 1.0
5 SSC-H 262144 32 0,0 310 1.0
6 SSC-W 262144 32 0,0 310 1.0
7 R660-A CD95 262144 32 0,0 490 1.0
8 R780-A CD8 262144 32 0,0 475 1.0
9 B515-A CD27 262144 32 0,0 470 1.0
10 B710-A CXCR4 262144 32 0,0 417 1.0
11 V450-A CCR7 262144 32 0,0 400 1.0
12 V545-A LIVE/DEAD 262144 32 0,0 495 1.0
13 V605-A CD4 262144 32 0,0 400 1.0
14 V655-A CD45RA 262144 32 0,0 375 1.0
15 V800-A CD3 262144 32 0,0 400 1.0
16 G560-A CD49B 262144 32 0,0 400 1.0
17 G610-A CD14/19 262144 32 0,0 415 1.0
18 G660-A CD69 262144 32 0,0 470 1.0
19 G780-A CD103 262144 32 0,0 435 1.0
20 Time 262144 32 0,0 0.01,
'header': {'FCS format': 'FCS3.0',
'text start': 256,
'text end': 6333,
'data start': 6339,
'data end': 19330498,
'analysis start': 0,
'analysis end': 0},
'spill': CD95 CD8 CD27 CXCR4 CCR7 LIVE/DEAD \
CD95 1.000000 0.097352 0.000000 0.007011 0.003501 0.000000
CD8 0.067916 1.000000 0.000000 0.000000 0.023879 0.000257
CD27 0.007903 0.000000 1.000000 0.007492 0.010284 0.027712
CXCR4 0.054363 0.100434 0.000000 1.000000 0.024458 0.001439
CCR7 0.002288 0.000000 0.000000 0.000000 1.000000 0.034874
LIVE/DEAD 0.000000 0.000000 0.003884 0.000705 0.014092 1.000000
CD4 0.009741 0.000263 0.000000 0.028274 0.080674 0.005858
CD45RA 0.275534 0.028670 0.000000 0.015079 0.102571 0.005517
CD3 0.022068 0.073814 0.000000 0.000000 0.099510 0.006832
CD49B 0.001869 0.000000 0.000000 0.048687 0.002103 0.009783
CD14/19 0.006566 0.000262 0.000000 0.177725 0.006049 0.000889
CD69 0.191802 0.032179 0.000000 0.396688 0.003517 0.000172
CD103 0.005300 0.105676 0.000000 0.016745 0.006381 0.000771
CD4 CD45RA CD3 CD49B CD14/19 CD69 \
CD95 0.000354 0.040952 0.008773 0.000067 0.001176 0.181536
CD8 0.000000 0.003852 0.100139 0.000877 0.000000 0.008990
CD27 0.003897 0.001299 0.000216 0.012664 0.002588 0.000000
CXCR4 0.000000 0.056397 0.194799 0.000491 0.000000 0.400014
CCR7 0.003729 0.000909 0.000282 0.000107 0.000000 0.000000
LIVE/DEAD 0.447288 0.144758 0.025271 0.000000 0.000000 0.000000
CD4 1.000000 0.434510 0.085092 0.055112 0.390481 0.290524
CD45RA 0.180690 1.000000 0.169154 0.000643 0.014777 0.120247
CD3 0.000891 0.003268 1.000000 0.000507 0.000000 0.000000
CD49B 0.038672 0.008945 0.001060 1.000000 0.400143 0.148085
CD14/19 0.065991 0.022692 0.003962 0.124522 1.000000 0.493361
CD69 0.000497 0.026221 0.009804 0.022587 0.010298 1.000000
CD103 0.000631 0.000561 0.126363 0.049163 0.018982 0.008683
CD103
CD95 0.005969
CD8 0.083250
CD27 0.000000
CXCR4 0.119091
CCR7 0.000000
LIVE/DEAD 0.000000
CD4 0.012522
CD45RA 0.005286
CD3 0.027061
CD49B 0.004221
CD14/19 0.019125
CD69 0.050517
CD103 1.000000 }
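The 'spill' entry above is the spillover matrix that compensation relies on. The sketch below shows the underlying linear algebra on a 2 × 2 excerpt (CD95 and CD8), using NumPy only. It assumes the common convention that each row describes how one fluorochrome spills into every detector, so that observed = true × spill. The event values are made up for illustration, and this is not a reproduction of how pytometry implements compensation.

```python
import numpy as np

# 2 x 2 excerpt of the 'spill' matrix above (rows and columns: CD95, CD8).
spill = np.array([
    [1.000000, 0.097352],
    [0.067916, 1.000000],
])

# Hypothetical true signals for three events (columns: CD95, CD8).
true_signal = np.array([
    [1000.0, 0.0],
    [0.0, 500.0],
    [200.0, 300.0],
])

# Spillover mixes the true signals into the detectors.
observed = true_signal @ spill

# Compensation undoes the mixing: solve observed = compensated @ spill.
# Solving the transposed system avoids forming the inverse explicitly.
compensated = np.linalg.solve(spill.T, observed.T).T

print(np.round(observed, 3))
print(np.allclose(compensated, true_signal))
```

Solving the linear system gives the same result as multiplying by the inverse of the spillover matrix, and is usually the numerically safer choice.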
Missing marker column
In some FCS files, the marker information does not follow the $PnS keyword pattern ($P1S, $P2S, and so on), and reading the FCS file might fail. In that case, set the reindex=False option when reading the file.
adata = pm.io.read_fcs(path_data, reindex=False)
adata
AnnData object with n_obs × n_vars = 241552 × 20
var: 'channel', 'marker', '$PnR', '$PnB', '$PnE', '$PnV', '$PnG'
uns: 'meta'
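To make the keyword pattern concrete, the sketch below pairs each $PnN keyword (channel name) with its $PnS keyword (marker name) and leaves the marker empty when the $PnS keyword is missing. The keywords dictionary is a hypothetical excerpt built from values shown earlier, and the code is a plain-Python illustration, not how readfcs parses files.

```python
import re

# Hypothetical excerpt of FCS TEXT keywords; a real file contains many more.
keywords = {
    "$P1N": "FSC-A",
    "$P7N": "R660-A",
    "$P7S": "CD95",
    "$P10N": "B710-A",
    "$P10S": "CXCR4",
    "$P20N": "Time",
}

# \d+ rather than a single digit, so that $P10N and above also match.
name_pattern = re.compile(r"^\$P(\d+)N$")

markers = {}
for key, channel in keywords.items():
    match = name_pattern.match(key)
    if match:
        n = int(match.group(1))
        # A missing $PnS keyword becomes an empty marker name.
        markers[n] = (channel, keywords.get(f"$P{n}S", ""))

for n in sorted(markers):
    print(n, markers[n])
```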
The .var section of the AnnData object contains the channel information. Here, a running number is used as var_names. The marker names can be created manually from the channel column.
adata.var
| n | channel | marker | $PnR | $PnB | $PnE | $PnV | $PnG |
|---|---|---|---|---|---|---|---|
| 1 | FSC-A | | 262144 | 32 | 0,0 | 510 | 1.0 |
| 2 | FSC-H | | 262144 | 32 | 0,0 | 510 | 1.0 |
| 3 | FSC-W | | 262144 | 32 | 0,0 | 510 | 1.0 |
| 4 | SSC-A | | 262144 | 32 | 0,0 | 310 | 1.0 |
| 5 | SSC-H | | 262144 | 32 | 0,0 | 310 | 1.0 |
| 6 | SSC-W | | 262144 | 32 | 0,0 | 310 | 1.0 |
| 7 | R660-A | CD95 | 262144 | 32 | 0,0 | 490 | 1.0 |
| 8 | R780-A | CD8 | 262144 | 32 | 0,0 | 475 | 1.0 |
| 9 | B515-A | CD27 | 262144 | 32 | 0,0 | 470 | 1.0 |
| 10 | B710-A | CXCR4 | 262144 | 32 | 0,0 | 417 | 1.0 |
| 11 | V450-A | CCR7 | 262144 | 32 | 0,0 | 400 | 1.0 |
| 12 | V545-A | LIVE/DEAD | 262144 | 32 | 0,0 | 495 | 1.0 |
| 13 | V605-A | CD4 | 262144 | 32 | 0,0 | 400 | 1.0 |
| 14 | V655-A | CD45RA | 262144 | 32 | 0,0 | 375 | 1.0 |
| 15 | V800-A | CD3 | 262144 | 32 | 0,0 | 400 | 1.0 |
| 16 | G560-A | CD49B | 262144 | 32 | 0,0 | 400 | 1.0 |
| 17 | G610-A | CD14/19 | 262144 | 32 | 0,0 | 415 | 1.0 |
| 18 | G660-A | CD69 | 262144 | 32 | 0,0 | 470 | 1.0 |
| 19 | G780-A | CD103 | 262144 | 32 | 0,0 | 435 | 1.0 |
| 20 | Time | | 262144 | 32 | 0,0 | | 0.01 |
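One possible way to build names manually is to use the marker where one exists and fall back to the channel name otherwise. The sketch below does this with pandas on a small stand-in for adata.var. It assumes missing markers are stored as empty strings; if they are stored as NaN in your data, the condition would need to be adapted.

```python
import pandas as pd

# Small stand-in for adata.var as read with reindex=False (values from the table above).
var = pd.DataFrame(
    {
        "channel": ["FSC-A", "R660-A", "R780-A", "Time"],
        "marker": ["", "CD95", "CD8", ""],
    },
    index=pd.Index([1, 7, 8, 20], name="n"),
)

# Keep the marker name where it is non-empty, otherwise use the channel name.
names = var["marker"].where(var["marker"] != "", var["channel"])

print(names.tolist())
```

With a real AnnData object, the resulting names could then be assigned to adata.var_names, provided they are unique.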