seabirdfilehandler 0.5.2__tar.gz → 0.5.3__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.3
  Name: seabirdfilehandler
- Version: 0.5.2
+ Version: 0.5.3
  Summary: Library of parsers to interact with SeaBird CTD files.
  Keywords: CTD,parser,seabird,data
  Author: Emil Michels
@@ -20,9 +20,33 @@ Project-URL: Homepage, https://git.io-warnemuende.de/CTD-Software/SeabirdFileHan
  Project-URL: Repository, https://git.io-warnemuende.de/CTD-Software/SeabirdFileHandler
  Description-Content-Type: text/markdown
 
+ # Intro
+
  This is a library for handling the different SeaBird file types. Each file is
  meant to be represented by one object that stores all of its information in a
  structured way. Through the grouping of different data types, more complex
- calculations, visualisations and output forms will be possible inside of those
+ calculations, visualisations and output forms are possible inside of those
  objects.
 
+ By being able to parse edited data and metadata back to the original file
+ format, this package can be used to process data using custom ideas, while
+ staying compatible to the original SeaBird software packages. This way, one can
+ create new workflows that interchangeably use old and new processing modules.
+ One implementation of this idea is the [ctd-processing python package](https://ctd-software.pages.io-warnemuende.de/processing/), also developed at the IOW.
+
+ The structured metadata does provide the possibility to leverage the vast
+ amounts of information stored inside the extensive metadata header. Sensor data
+ and processing information are readily available in intuitive dictionaries.
+
+ ## Development roadmap
+
+ ### misc improvements
+
+ - refactor processing module handling
+ - extend individual parameter information
+ - handle duplicate input columns
+
+ ### visualisation
+
+ - write an intuitive visualisation module
+
@@ -0,0 +1,29 @@
+ # Intro
+
+ This is a library for handling the different SeaBird file types. Each file is
+ meant to be represented by one object that stores all of its information in a
+ structured way. Through the grouping of different data types, more complex
+ calculations, visualisations and output forms are possible inside of those
+ objects.
+
+ By being able to parse edited data and metadata back to the original file
+ format, this package can be used to process data using custom ideas, while
+ staying compatible to the original SeaBird software packages. This way, one can
+ create new workflows that interchangeably use old and new processing modules.
+ One implementation of this idea is the [ctd-processing python package](https://ctd-software.pages.io-warnemuende.de/processing/), also developed at the IOW.
+
+ The structured metadata does provide the possibility to leverage the vast
+ amounts of information stored inside the extensive metadata header. Sensor data
+ and processing information are readily available in intuitive dictionaries.
+
+ ## Development roadmap
+
+ ### misc improvements
+
+ - refactor processing module handling
+ - extend individual parameter information
+ - handle duplicate input columns
+
+ ### visualisation
+
+ - write an intuitive visualisation module
@@ -19,7 +19,7 @@ classifiers = [
  urls.homepage = "https://git.io-warnemuende.de/CTD-Software/SeabirdFileHandler"
  urls.repository = "https://git.io-warnemuende.de/CTD-Software/SeabirdFileHandler"
  dynamic = []
- version = "0.5.2"
+ version = "0.5.3"
 
  [tool.poetry]
 
@@ -43,6 +43,7 @@ pyment = ">=0.3.3"
  pylint = ">=3.0.2"
  pre-commit = ">=3.6.2"
  tomlkit = ">=0.13.2"
+ myst-parser = "^4.0.1"
 
  [tool.pytest.ini_options]
  pythonpath = [".", "src", "src/seabirdfilehandler"]
@@ -19,21 +19,26 @@ class CnvFile(DataFile):
  be able to use this representation for all applications concerning cnv
  files, like data processing, transformation or visualization.
 
- To achieve that, the metadata header is organized by the grandparent-class,
- SeaBirdFile, while the data table is extracted by this class. The data
- representation of choice is a pandas Dataframe. Inside this class, there
- are methods to parse cnv data into dataframes, do the reverse of writing a
- dataframe into cnv compliant form and to manipulate the dataframe in
- various ways.
+ To achieve that, the metadata header is organized by the parent-class,
+ DataFile, while the data table is extracted by this class. The data
+ representation can be a numpy array or pandas dataframe. The handling of
+ the data is mostly done inside parameters, a representation of the
+ individual measurement parameter data and metadata.
+
+ This class is also able to parse the edited data and metadata back to the
+ original .cnv file format, allowing for custom data processing using this
+ representation, while still being able to use Sea-Birds original software
+ on that output. It also allows to stay comparable with other parsers or
+ methods in general.
 
  Parameters
  ----------
  path_to_file: Path | str:
  the path to the file
- full_data_header: bool:
- whether to use the full data column descriptions for the dataframe
- long_header_names: bool:
- whether to use long header names in the dateframe
+ only_header: bool :
+ Whether to stop reading the file after the metadata header.
+ create_dataframe: bool :
+ Whether to create a pandas DataFrame from the data table.
  absolute_time_calculation: bool:
  whether to use a real timestamp instead of the second count
  event_log_column: bool:
@@ -18,15 +18,21 @@ logger = logging.getLogger(__name__)
 
 
  class DataFile:
- """Collection of methods for the SeaBird files that feature some kind of
- data table that is represented in a pandas dataframe.
+ """
+ The base class for all Sea-Bird data files, which are .cnv, .btl, and .bl .
+ One instance of this class, or its children, represents one data text file.
+ The different information bits of such a file are structured into individual
+ lists or dictionaries. The data table will be loaded as numpy array and
+ can be converted to a pandas DataFrame. Datatype-specific behavior is
+ implemented in the subclasses.
+
 
  Parameters
  ----------
-
- Returns
- -------
-
+ path_to_file: Path | str :
+ The file to the data file.
+ only_header: bool :
+ Whether to stop reading the file after the metadata header.
  """
 
  def __init__(
@@ -66,16 +72,10 @@ class DataFile:
  return self.file_data == other.file_data
 
  def read_file(self):
- """Reads and structures all the different information present in the
+ """
+ Reads and structures all the different information present in the
  file. Lists and Dictionaries are the data structures of choice. Uses
  basic prefix checking to distinguish different header information.
-
- Parameters
- ----------
-
- Returns
- -------
-
  """
  past_sensors = False
  with self.path_to_file.open("r", encoding="latin-1") as file:
@@ -109,14 +109,18 @@ class DataFile:
  def sensor_xml_to_flattened_dict(
  self, sensor_data: str
  ) -> list[dict] | dict:
- """Reads the pure xml sensor input and creates a multilevel dictionary,
+ """
+ Reads the pure xml sensor input and creates a multilevel dictionary,
  dropping the first two dictionaries, as they are single entry only
 
  Parameters
  ----------
+ sensor_data: str:
+ The raw xml sensor data.
 
  Returns
  -------
+ A list of sensor information, which is a structured dict.
 
  """
  full_sensor_dict = xmltodict.parse(sensor_data, process_comments=True)
@@ -153,8 +157,9 @@ class DataFile:
  return tidied_sensor_list
 
  def structure_metadata(self, metadata_list: list) -> dict:
- """Creates a dictionary to store the metadata that is added by using
- werums dship API.
+ """
+ Creates a dictionary to store custom metadata, of which Sea-Bird allows
+ 12 lines in each file.
 
  Parameters
  ----------
@@ -181,7 +186,8 @@ class DataFile:
  file_name: str | None = None,
  file_type: str = ".csv",
  ) -> Path:
- """Creates a Path object holding the desired output path.
+ """
+ Creates a Path object holding the desired output path.
 
  Parameters
  ----------
@@ -209,14 +215,13 @@ class DataFile:
  output_file_path: Path | str | None = None,
  output_file_name: str | None = None,
  ):
- """Writes a csv from the current dataframe. Takes a list of columns to
- use, a boolean for writing the header and the output file parameters.
+ """
+ Writes a csv from the given data.
 
  Parameters
  ----------
- selected_columns : list :
- a list of columns to include in the csv
- (Default value = self.df.columns)
+ data: pd.DataFrame | np.ndarray :
+ The source data to use.
  with_header : boolean :
  indicating whether the header shall appear in the output
  (Default value = True)
@@ -246,7 +251,8 @@ class DataFile:
  list_of_columns: list | str,
  df: pd.DataFrame,
  ):
- """Alters the dataframe to only hold the given columns.
+ """
+ Alters the dataframe to only hold the given columns.
 
  Parameters
  ----------
@@ -18,10 +18,10 @@ class Parameters(UserDict):
 
  Parameters
  ----------
- data: list:
- The raw data as extraced by SeaBirdFile
- metadata: list,
- The raw metadata as extraced by SeaBirdFile
+ data: list
+ The raw data as extraced by DataFile
+ metadata: list
+ The raw metadata as extraced by DataFile
 
  Returns
  -------
@@ -42,6 +42,9 @@ class Parameters(UserDict):
  )
  self.data = self.create_parameter_instances()
 
+ def get_parameter_names(self) -> list[str]:
+ return [parameter["name"] for parameter in self.metadata.values()]
+
  def get_parameter_list(self) -> list[Parameter]:
  """ """
  return list(self.data.values())
@@ -1,5 +0,0 @@
- This is a library for handling the different SeaBird file types. Each file is
- meant to be represented by one object that stores all of its information in a
- structured way. Through the grouping of different data types, more complex
- calculations, visualisations and output forms will be possible inside of those
- objects.
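
Note on the `get_parameter_names` method added to `Parameters(UserDict)` in this release: the one-line body reads the `"name"` entry of every value in `self.metadata`. The following is a minimal, self-contained sketch of that behaviour, not the package's actual class. `ParametersSketch`, the constructor signature, and the sample metadata are hypothetical and only assume that `metadata` maps a column key to a dict containing a `"name"` key.

```python
from collections import UserDict


class ParametersSketch(UserDict):
    """Hypothetical stand-in for the package's Parameters class."""

    def __init__(self, metadata: dict[str, dict]):
        super().__init__()
        # Assumed layout: one metadata dict per parameter, each with a "name" key.
        self.metadata = metadata

    def get_parameter_names(self) -> list[str]:
        # Same expression as the method added in 0.5.3.
        return [parameter["name"] for parameter in self.metadata.values()]


# Made-up sample metadata, not taken from a real .cnv file.
sample_metadata = {
    "prDM": {"name": "Pressure", "unit": "db"},
    "t090C": {"name": "Temperature", "unit": "deg C"},
}
names = ParametersSketch(sample_metadata).get_parameter_names()
print(names)
```

Under this assumption the names come back in the insertion order of the metadata dict, and a metadata entry without a `"name"` key would raise a `KeyError`.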