/*!
The `csv` crate provides a fast and flexible CSV reader and writer, with
support for Serde.

The [tutorial](tutorial/index.html) is a good place to start if you're new to
Rust.

The [cookbook](cookbook/index.html) will give you a variety of complete Rust
programs that do CSV reading and writing.

# Brief overview

**If you're new to Rust**, you might find the
[tutorial](tutorial/index.html)
to be a good place to start.

The primary types in this crate are
[`Reader`](struct.Reader.html)
and
[`Writer`](struct.Writer.html),
for reading and writing CSV data respectively.
Correspondingly, to support CSV data with custom field or record delimiters
(among many other things), you should use either a
[`ReaderBuilder`](struct.ReaderBuilder.html)
or a
[`WriterBuilder`](struct.WriterBuilder.html),
depending on whether you're reading or writing CSV data.

Unless you're using Serde, the standard CSV record types are
[`StringRecord`](struct.StringRecord.html)
and
[`ByteRecord`](struct.ByteRecord.html).
`StringRecord` should be used when you know your data to be valid UTF-8.
For data that may be invalid UTF-8, `ByteRecord` is suitable.
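
The choice between the two comes down to whether every field's bytes form
valid UTF-8. As a std-only sketch of that check (the `field_as_str` helper is
hypothetical and is not how this crate converts records), `std::str::from_utf8`
succeeds only when the bytes are valid UTF-8:

```rust
// Hypothetical helper: view a raw field as `&str` if, and only if, its bytes
// are valid UTF-8. Returns `None` otherwise.
fn field_as_str(field: &[u8]) -> Option<&str> {
    std::str::from_utf8(field).ok()
}
```

A field such as `b"Boston"` converts cleanly, while a field containing a stray
`0xFF` byte does not, which is the situation where working with raw bytes is
the safer choice.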

Finally, the set of errors is described by the
[`Error`](struct.Error.html)
type.

The rest of the types in this crate mostly correspond to more detailed errors,
position information, configuration knobs or iterator types.

# Setup

Run `cargo add csv` to add the latest version of the `csv` crate to your
`Cargo.toml`.

If you want to use Serde's custom derive functionality on your custom structs,
then run `cargo add serde --features derive` to add the `serde` crate with its
`derive` feature enabled to your `Cargo.toml`.

# Example

This example shows how to read CSV data from stdin and print each record to
stdout.

There are more examples in the [cookbook](cookbook/index.html).

```no_run
use std::{error::Error, io, process};

fn example() -> Result<(), Box<dyn Error>> {
    // Build the CSV reader and iterate over each record.
    let mut rdr = csv::Reader::from_reader(io::stdin());
    for result in rdr.records() {
        // The iterator yields Result<StringRecord, Error>, so we check the
        // error here.
        let record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
```

The above example can be run like so:

```ignore
$ git clone https://github.com/BurntSushi/rust-csv
$ cd rust-csv
$ cargo run --example cookbook-read-basic < examples/data/smallpop.csv
```

# Example with Serde

This example shows how to read CSV data from stdin into your own custom struct.
By default, the member names of the struct are matched with the values in the
header record of your CSV data.

```no_run
use std::{error::Error, io, process};

#[derive(Debug, serde::Deserialize)]
struct Record {
    city: String,
    region: String,
    country: String,
    population: Option<u64>,
}

fn example() -> Result<(), Box<dyn Error>> {
    let mut rdr = csv::Reader::from_reader(io::stdin());
    for result in rdr.deserialize() {
        // Notice that we need to provide a type hint for automatic
        // deserialization.
        let record: Record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
```

The above example can be run like so:

```ignore
$ git clone https://github.com/BurntSushi/rust-csv
$ cd rust-csv
$ cargo run --example cookbook-read-serde < examples/data/smallpop.csv
```

*/

#![deny(missing_docs)]

use std::result;

use serde_core::{Deserialize, Deserializer};

pub use crate::{
    byte_record::{ByteRecord, ByteRecordIter, Position},
    deserializer::{DeserializeError, DeserializeErrorKind},
    error::{
        Error, ErrorKind, FromUtf8Error, IntoInnerError, Result, Utf8Error,
    },
    reader::{
        ByteRecordsIntoIter, ByteRecordsIter, DeserializeRecordsIntoIter,
        DeserializeRecordsIter, Reader, ReaderBuilder, StringRecordsIntoIter,
        StringRecordsIter,
    },
    string_record::{StringRecord, StringRecordIter},
    writer::{Writer, WriterBuilder},
};

mod byte_record;
pub mod cookbook;
mod debug;
mod deserializer;
mod error;
mod reader;
mod serializer;
mod string_record;
pub mod tutorial;
mod writer;

/// The quoting style to use when writing CSV data.
#[derive(Clone, Copy, Debug, Default)]
#[non_exhaustive]
pub enum QuoteStyle {
    /// This puts quotes around every field. Always.
    Always,
    /// This puts quotes around fields only when necessary.
    ///
    /// They are necessary when fields contain a quote, delimiter or record
    /// terminator. Quotes are also necessary when writing an empty record
    /// (which is indistinguishable from a record with one empty field).
    ///
    /// This is the default.
    #[default]
    Necessary,
    /// This puts quotes around all fields that are non-numeric. Namely, when
    /// writing a field that does not parse as a valid float or integer, then
    /// quotes will be used even if they aren't strictly necessary.
    NonNumeric,
    /// This *never* writes quotes, even if it would produce invalid CSV data.
    Never,
}

impl QuoteStyle {
    fn to_core(self) -> csv_core::QuoteStyle {
        match self {
            QuoteStyle::Always => csv_core::QuoteStyle::Always,
            QuoteStyle::Necessary => csv_core::QuoteStyle::Necessary,
            QuoteStyle::NonNumeric => csv_core::QuoteStyle::NonNumeric,
            QuoteStyle::Never => csv_core::QuoteStyle::Never,
        }
    }
}

/// A record terminator.
///
/// Use this to specify the record terminator while parsing CSV. The default is
/// CRLF, which treats `\r`, `\n` or `\r\n` as a single record terminator.
#[derive(Clone, Copy, Debug, Default)]
#[non_exhaustive]
pub enum Terminator {
    /// Parses `\r`, `\n` or `\r\n` as a single record terminator.
    #[default]
    CRLF,
    /// Parses the byte given as a record terminator.
    Any(u8),
}

impl Terminator {
    /// Convert this to the csv_core type of the same name.
    fn to_core(self) -> csv_core::Terminator {
        match self {
            Terminator::CRLF => csv_core::Terminator::CRLF,
            Terminator::Any(b) => csv_core::Terminator::Any(b),
        }
    }
}

/// The whitespace preservation behavior when reading CSV data.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
#[non_exhaustive]
pub enum Trim {
    /// Preserves fields and headers. This is the default.
    #[default]
    None,
    /// Trim whitespace from headers, but not fields.
    Headers,
    /// Trim whitespace from fields, but not headers.
    Fields,
    /// Trim whitespace from fields and headers.
    All,
}

impl Trim {
    fn should_trim_fields(&self) -> bool {
        self == &Trim::Fields || self == &Trim::All
    }

    fn should_trim_headers(&self) -> bool {
        self == &Trim::Headers || self == &Trim::All
    }
}

/// A custom Serde deserializer for possibly invalid `Option<T>` fields.
///
/// When deserializing CSV data, it is sometimes desirable to simply ignore
/// fields with invalid data. For example, there might be a field that is
/// usually a number, but will occasionally contain garbage data that causes
/// number parsing to fail.
///
/// You might be inclined to use, say, `Option<i32>` for fields such as this.
/// By default, however, `Option<i32>` will either capture *empty* fields with
/// `None` or valid numeric fields with `Some(the_number)`. If the field is
/// non-empty and not a valid number, then deserialization will return an error
/// instead of using `None`.
///
/// This function allows you to override this default behavior. Namely, if
/// `Option<T>` is deserialized with non-empty but invalid data, then the value
/// will be `None` and the error will be ignored.
///
/// # Example
///
/// This example shows how to parse CSV records with numerical data, even if
/// some numerical data is absent or invalid. Without the
/// `serde(deserialize_with = "...")` annotations, this example would return
/// an error.
///
/// ```
/// use std::error::Error;
///
/// #[derive(Debug, serde::Deserialize, Eq, PartialEq)]
/// struct Row {
///     #[serde(deserialize_with = "csv::invalid_option")]
///     a: Option<i32>,
///     #[serde(deserialize_with = "csv::invalid_option")]
///     b: Option<i32>,
///     #[serde(deserialize_with = "csv::invalid_option")]
///     c: Option<i32>,
/// }
///
/// # fn main() { example().unwrap(); }
/// fn example() -> Result<(), Box<dyn Error>> {
///     let data = "\
/// a,b,c
/// 5,\"\",xyz
/// ";
///     let mut rdr = csv::Reader::from_reader(data.as_bytes());
///     if let Some(result) = rdr.deserialize().next() {
///         let record: Row = result?;
///         assert_eq!(record, Row { a: Some(5), b: None, c: None });
///         Ok(())
///     } else {
///         Err(From::from("expected at least one record but got none"))
///     }
/// }
/// ```
pub fn invalid_option<'de, D, T>(de: D) -> result::Result<Option<T>, D::Error>
where
    D: Deserializer<'de>,
    Option<T>: Deserialize<'de>,
{
    Option::<T>::deserialize(de).or_else(|_| Ok(None))
}
308{
309    Option::<T>::deserialize(de).or_else(|_| Ok(None))
310}