GFAParser: module to parse and write GFA format
- class gfagraphs.gfaparser.GFAParser
This class implements static methods to get information about the contents of a GFA file, and to parse them.
- Returns
All methods are static: call them on the class and pass the data as arguments.
- Return type
None
- Raises
OSError – The file does not exist
IOError – File is empty
IOError – File descriptor is invalid
NotImplementedError – Byte-array or array is saved to GFA
ValueError – Data format not in GFA standards
- static get_gfa_format(gfa_file_path: str | list[str]) → str | list[str]
Given one or more files, returns their GFA subtypes, and raises an error if a file is invalid or does not exist. The objective is to assess the GFA subformat of files for pre-processing purposes or algorithm choices.
- Parameters
gfa_file_path (str | list[str]) – a series of paths, or a single one
- Returns
per path, a tag identifying the gfa type
- Return type
str | list[str]
- Raises
OSError – Specified file does not exist
IOError – File descriptor is invalid
IOError – File is empty
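As an illustration of the checks described above, here is a minimal sketch of how a subformat probe might look. It is not the library's implementation: `sniff_gfa_version` is a hypothetical helper, and it assumes the subformat can be read from the `VN:Z:` tag of the header (`H`) line, which is how the GFA specification records the version. The tags actually returned by `get_gfa_format` may differ.

```python
import os


def sniff_gfa_version(path: str) -> str:
    """Hypothetical helper: return the VN:Z: value of the header line.

    Mirrors the documented error cases: a missing file raises OSError and
    an empty file raises IOError (an alias of OSError in Python 3).
    """
    if not os.path.exists(path):
        raise OSError(f"Specified file does not exist: {path}")
    if os.path.getsize(path) == 0:
        raise IOError(f"File is empty: {path}")
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if fields[0] != "H":
                continue
            # Header tags follow the TAG:TYPE:VALUE layout.
            for field in fields[1:]:
                if field.startswith("VN:Z:"):
                    return field[len("VN:Z:"):]
    # Assumption of this sketch: no header version means "unknown".
    return "unknown"
```

A caller could use the returned string to pick a parsing strategy before loading the whole graph.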
- static get_gfa_type(tag_type: str) → Union[type, Callable]
Interprets GFA tags as a Python-compatible format. Given a letter used as a type tag in the GFA standard, returns the type or function to cast the data to. This function is used in input scenarios, to read a file from disk and interpret its content.
- Parameters
tag_type (str) – a GFA tag
- Returns
a cast descriptor to use on the data
- Return type
type | Callable
- Raises
NotImplementedError – Byte-array or array
ValueError – Type identifier is not in the GFA-spec
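The mapping from a GFA type letter to a cast can be sketched as follows. This is an illustration under assumptions, not the library's code: `cast_for_tag` is a hypothetical name, and the letter-to-type table follows the optional-field types of the GFA1 specification (`A`, `i`, `f`, `Z`, `J`, `H`, `B`).

```python
import json
from typing import Callable, Union


def cast_for_tag(tag_type: str) -> Union[type, Callable]:
    """Hypothetical helper: return a callable that casts a raw tag value."""
    # Byte arrays (H) and typed arrays (B) are documented as unsupported.
    if tag_type in ("H", "B"):
        raise NotImplementedError("Byte-array or array tags are not supported")
    casts = {
        "i": int,          # signed integer
        "f": float,        # floating-point number
        "A": str,          # single printable character
        "Z": str,          # printable string
        "J": json.loads,   # JSON payload
    }
    if tag_type not in casts:
        raise ValueError(f"Type identifier {tag_type!r} is not in the GFA-spec")
    return casts[tag_type]
```

For example, `cast_for_tag("i")("42")` yields the integer `42`, and `cast_for_tag("J")('{"a": 1}')` yields a dictionary.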
- static get_python_type(data: object) → str
From a Python variable, tries to identify the best-suited tag, and validates it. See http://gfa-spec.github.io/GFA-spec/GFA1.html#optional-fields for more details.
- Parameters
data (object) – the data we try to add to the GFA file
- Returns
a one-letter code for an optional field of the GFA-spec
- Return type
str
- Raises
ValueError – data type could not be encoded in the GFA-spec
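The reverse direction, from a Python value to a one-letter code, might look like the sketch below. `tag_letter_for` is a hypothetical name, and the choices it makes (strings always map to `Z`, containers map to `J`, booleans are rejected) are assumptions of this example rather than documented behavior of `get_python_type`.

```python
def tag_letter_for(data: object) -> str:
    """Hypothetical helper: pick a GFA optional-field type letter for a value."""
    # bool is a subclass of int; this sketch rejects it rather than
    # silently writing True as 1 (a design choice, not library behavior).
    if isinstance(data, bool):
        raise ValueError("bool could not be encoded in the GFA-spec")
    if isinstance(data, int):
        return "i"
    if isinstance(data, float):
        return "f"
    if isinstance(data, str):
        return "Z"
    if isinstance(data, (dict, list)):
        return "J"
    raise ValueError(
        f"{type(data).__name__} could not be encoded in the GFA-spec"
    )
```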
- static read_gfa_line(datas: list[str], load_sequence_in_memory: bool = True, regexp_pattern: str = '.*', memory_mode: bool = False) → tuple[str, gfagraphs.abstractions.GFALine, dict]
Calls methods to parse a GFA line according to its fields, as described in the GFA-spec GitHub repository. Parses a single line and returns the information it contains.
- Parameters
datas (list[str]) – the list of tab-separated elements of the GFA line.
load_sequence_in_memory (bool, optional) – if the line is a node, whether its sequence should be loaded, by default True
regexp_pattern (str, optional) – a pattern to keep for path names, by default “.*”
memory_mode (bool, optional) – if additional information should be loaded in the struct, by default False
- Returns
a tuple of (id_of_line, type_of_line, datas_of_line)
- Return type
tuple[str, GFALine, dict]
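To make the shape of the input and output concrete, here is a minimal sketch for the simplest case, a GFA1 segment (`S`) line. `parse_segment_fields` and the dictionary keys `length` and `seq` are hypothetical; the real method returns a `GFALine` member as the second element and handles every line type.

```python
def parse_segment_fields(datas: list, load_sequence: bool = True) -> tuple:
    """Hypothetical helper: parse the mandatory fields of an 'S' line.

    `datas` is the line already split on tabs, e.g. ["S", "1", "ACGT"].
    """
    if len(datas) < 3 or datas[0] != "S":
        raise ValueError("Not a GFA1 segment line")
    name, sequence = datas[1], datas[2]
    record = {}
    # In GFA1, "*" means the sequence is not stored in the file.
    if sequence != "*":
        record["length"] = len(sequence)
    # Skipping the sequence keeps memory usage low on large graphs.
    if load_sequence:
        record["seq"] = sequence
    return name, "S", record


line = "S\t1\tACGT\tLN:i:4"
print(parse_segment_fields(line.split("\t")))
```

Passing `load_sequence=False` keeps only the length, which is the kind of trade-off `load_sequence_in_memory` is described as controlling.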
- static save_graph(graph, output_path: str, force_format: gfagraphs.abstractions.GFAFormat | bool = False, minimal_graph: bool = False) → None
Given a GFA Graph object, saves the graph to a valid GFA file.
- Parameters
- static save_subgraph(graph, output_path: str, nodes: set[str], force_format: gfagraphs.abstractions.GFAFormat | bool = False, minimal_graph: bool = False) → None
Given a GFA Graph object and a set of node names, saves the corresponding subgraph to a valid GFA file.
- Parameters
- static set_gfa_type(tag_type: str) → Union[type, Callable]
Interprets GFA tags as a Python-compatible format. Given a letter used as a type tag in the GFA standard, returns the type or function to cast the data to. This function is used in output scenarios, to write a file to disk.
- Parameters
tag_type (str) – a GFA tag
- Returns
a cast descriptor to use on the data
- Return type
type | Callable
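For the output direction, a cast turns a Python value back into the text stored in the file. The sketch below assumes the `TAG:TYPE:VALUE` layout of GFA optional fields; `format_tag` and its writer table are hypothetical and only illustrate the idea.

```python
import json


def format_tag(name: str, tag_type: str, value: object) -> str:
    """Hypothetical helper: serialise one optional field as TAG:TYPE:VALUE."""
    # Byte arrays (H) and typed arrays (B) are documented as unsupported.
    if tag_type in ("H", "B"):
        raise NotImplementedError("Byte-array or array tags are not supported")
    writers = {
        "i": lambda v: str(int(v)),
        "f": lambda v: str(float(v)),
        "A": str,
        "Z": str,
        # Compact separators avoid adding spaces inside the field.
        "J": lambda v: json.dumps(v, separators=(",", ":")),
    }
    if tag_type not in writers:
        raise ValueError(f"Type identifier {tag_type!r} is not in the GFA-spec")
    return f"{name}:{tag_type}:{writers[tag_type](value)}"
```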
- static supplementary_datas(datas: list, length_condition: int) → dict
Computes the optional tags of a GFA line and returns them as a dict.
- Parameters
datas (list) – a list of tags and their values
length_condition (int) – the number of mandatory fields (already processed) that precede the optional tags
- Returns
interpreted tags in their right types
- Return type
dict
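A minimal sketch of this step, assuming that `length_condition` is the number of leading mandatory fields to skip and that each remaining field follows the `TAG:TYPE:VALUE` layout. `parse_optional_tags` is a hypothetical stand-in for the method, not its actual code.

```python
import json


def parse_optional_tags(datas: list, length_condition: int) -> dict:
    """Hypothetical helper: turn trailing TAG:TYPE:VALUE fields into a dict."""
    casts = {"i": int, "f": float, "A": str, "Z": str, "J": json.loads}
    tags = {}
    for field in datas[length_condition:]:
        # maxsplit=2 keeps any ':' inside the value intact.
        name, tag_type, value = field.split(":", 2)
        if tag_type not in casts:
            raise ValueError(f"Unsupported tag type {tag_type!r} in {field!r}")
        tags[name] = casts[tag_type](value)
    return tags


fields = ["S", "1", "ACGT", "LN:i:4", "SN:Z:chr1:a"]
print(parse_optional_tags(fields, 3))
```

Here the first three fields (record type, name, sequence) are mandatory for a segment line, so only the last two are interpreted.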
- gfagraphs.gfaparser.path_allocator(path_to_validate: str, particle: str | None = None, default_name: str = 'file', always_yes: bool = True) → str
Checks whether a file exists at this location and whether its directory tree exists. If not, creates the directory tree.
- Args:
path_to_validate (str): a string path to the file
particle (str | None, optional): file extension. Defaults to None.
default_name (str): a name if name is empty
always_yes (bool, optional): if file shall be erased by default. Defaults to True.
- Returns:
str: the path to the file, with extension
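The directory-creation part of this behavior can be sketched with `pathlib`. `ensure_output_path` is a hypothetical stand-in: it assumes `particle` includes the leading dot (for example `".gfa"`), and it does not model `always_yes`, since the overwrite logic is not described here.

```python
from pathlib import Path
from typing import Optional


def ensure_output_path(
    path_to_validate: str,
    particle: Optional[str] = None,
    default_name: str = "file",
) -> str:
    """Hypothetical helper: create missing parent folders, return the path."""
    path = Path(path_to_validate)
    # A trailing separator or an existing folder means no file name was given.
    if path_to_validate.endswith(("/", "\\")) or path.is_dir():
        path = path / default_name
    # Append the extension only if the name does not already end with it.
    if particle and not path.name.endswith(particle):
        path = path.with_name(path.name + particle)
    # Create the directory tree; no error if it already exists.
    path.parent.mkdir(parents=True, exist_ok=True)
    return str(path)
```

Appending the extension, rather than replacing an existing suffix, is a choice of this sketch so that names containing dots are left untouched.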