csv/lib.rs
/*!
The `csv` crate provides a fast and flexible CSV reader and writer, with
support for Serde.

The [tutorial](tutorial/index.html) is a good place to start if you're new to
Rust.

The [cookbook](cookbook/index.html) will give you a variety of complete Rust
programs that do CSV reading and writing.

# Brief overview

**If you're new to Rust**, you might find the
[tutorial](tutorial/index.html)
to be a good place to start.

The primary types in this crate are
[`Reader`](struct.Reader.html)
and
[`Writer`](struct.Writer.html),
for reading and writing CSV data respectively.
Correspondingly, to support CSV data with custom field or record delimiters
(among many other things), you should use either a
[`ReaderBuilder`](struct.ReaderBuilder.html)
or a
[`WriterBuilder`](struct.WriterBuilder.html),
depending on whether you're reading or writing CSV data.

Unless you're using Serde, the standard CSV record types are
[`StringRecord`](struct.StringRecord.html)
and
[`ByteRecord`](struct.ByteRecord.html).
`StringRecord` should be used when you know your data to be valid UTF-8.
For data that may be invalid UTF-8, `ByteRecord` is suitable.
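
The distinction comes down to whether a field's bytes pass UTF-8 validation.
The sketch below uses only the standard library (it does not call into this
crate) to show the kind of check that separates the two cases. `field_as_str`
is a hypothetical helper introduced purely for illustration.

```rust
/// Returns the field as `&str` when its bytes are valid UTF-8. Otherwise
/// returns `None`, in which case the caller would keep working with the raw
/// bytes, as one does with a `ByteRecord`.
fn field_as_str(field: &[u8]) -> Option<&str> {
    std::str::from_utf8(field).ok()
}
```

A field such as `b"Boston"` converts cleanly, while a field containing the byte
`0xFF` does not, because `0xFF` never appears in valid UTF-8.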

Finally, the set of errors is described by the
[`Error`](struct.Error.html)
type.

The rest of the types in this crate mostly correspond to more detailed errors,
position information, configuration knobs or iterator types.

# Setup

Run `cargo add csv` to add the latest version of the `csv` crate to your
`Cargo.toml`.

If you want to use Serde's custom derive functionality on your custom structs,
then run `cargo add serde --features derive` to add the `serde` crate with its
`derive` feature enabled to your `Cargo.toml`.

# Example

This example shows how to read CSV data from stdin and print each record to
stdout.

There are more examples in the [cookbook](cookbook/index.html).

```no_run
use std::{error::Error, io, process};

fn example() -> Result<(), Box<dyn Error>> {
    // Build the CSV reader and iterate over each record.
    let mut rdr = csv::Reader::from_reader(io::stdin());
    for result in rdr.records() {
        // The iterator yields Result<StringRecord, Error>, so we check the
        // error here.
        let record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
```

The above example can be run like so:

```ignore
$ git clone https://github.com/BurntSushi/rust-csv
$ cd rust-csv
$ cargo run --example cookbook-read-basic < examples/data/smallpop.csv
```

# Example with Serde

This example shows how to read CSV data from stdin into your own custom struct.
By default, the member names of the struct are matched with the values in the
header record of your CSV data.

```no_run
use std::{error::Error, io, process};

#[derive(Debug, serde::Deserialize)]
struct Record {
    city: String,
    region: String,
    country: String,
    population: Option<u64>,
}

fn example() -> Result<(), Box<dyn Error>> {
    let mut rdr = csv::Reader::from_reader(io::stdin());
    for result in rdr.deserialize() {
        // Notice that we need to provide a type hint for automatic
        // deserialization.
        let record: Record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
```

The above example can be run like so:

```ignore
$ git clone https://github.com/BurntSushi/rust-csv
$ cd rust-csv
$ cargo run --example cookbook-read-serde < examples/data/smallpop.csv
```

*/

#![deny(missing_docs)]

use std::result;

use serde::{Deserialize, Deserializer};

pub use crate::{
    byte_record::{ByteRecord, ByteRecordIter, Position},
    deserializer::{DeserializeError, DeserializeErrorKind},
    error::{
        Error, ErrorKind, FromUtf8Error, IntoInnerError, Result, Utf8Error,
    },
    reader::{
        ByteRecordsIntoIter, ByteRecordsIter, DeserializeRecordsIntoIter,
        DeserializeRecordsIter, Reader, ReaderBuilder, StringRecordsIntoIter,
        StringRecordsIter,
    },
    string_record::{StringRecord, StringRecordIter},
    writer::{Writer, WriterBuilder},
};

mod byte_record;
pub mod cookbook;
mod debug;
mod deserializer;
mod error;
mod reader;
mod serializer;
mod string_record;
pub mod tutorial;
mod writer;

/// The quoting style to use when writing CSV data.
#[derive(Clone, Copy, Debug)]
pub enum QuoteStyle {
    /// This puts quotes around every field. Always.
    Always,
    /// This puts quotes around fields only when necessary.
    ///
    /// They are necessary when fields contain a quote, delimiter or record
    /// terminator. Quotes are also necessary when writing an empty record
    /// (which is indistinguishable from a record with one empty field).
    ///
    /// This is the default.
    Necessary,
    /// This puts quotes around all fields that are non-numeric. Namely, when
    /// writing a field that does not parse as a valid float or integer, then
    /// quotes will be used even if they aren't strictly necessary.
    NonNumeric,
    /// This *never* writes quotes, even if it would produce invalid CSV data.
    Never,
    /// Hints that destructuring should not be exhaustive.
    ///
    /// This enum may grow additional variants, so this makes sure clients
    /// don't count on exhaustive matching. (Otherwise, adding a new variant
    /// could break existing code.)
    #[doc(hidden)]
    __Nonexhaustive,
}

impl QuoteStyle {
    fn to_core(self) -> csv_core::QuoteStyle {
        match self {
            QuoteStyle::Always => csv_core::QuoteStyle::Always,
            QuoteStyle::Necessary => csv_core::QuoteStyle::Necessary,
            QuoteStyle::NonNumeric => csv_core::QuoteStyle::NonNumeric,
            QuoteStyle::Never => csv_core::QuoteStyle::Never,
            _ => unreachable!(),
        }
    }
}

impl Default for QuoteStyle {
    fn default() -> QuoteStyle {
        QuoteStyle::Necessary
    }
}

/// A record terminator.
///
/// Use this to specify the record terminator while parsing CSV. The default is
/// CRLF, which treats `\r`, `\n` or `\r\n` as a single record terminator.
#[derive(Clone, Copy, Debug)]
pub enum Terminator {
    /// Parses `\r`, `\n` or `\r\n` as a single record terminator.
    CRLF,
    /// Parses the byte given as a record terminator.
    Any(u8),
    /// Hints that destructuring should not be exhaustive.
    ///
    /// This enum may grow additional variants, so this makes sure clients
    /// don't count on exhaustive matching. (Otherwise, adding a new variant
    /// could break existing code.)
    #[doc(hidden)]
    __Nonexhaustive,
}

impl Terminator {
    /// Convert this to the csv_core type of the same name.
    fn to_core(self) -> csv_core::Terminator {
        match self {
            Terminator::CRLF => csv_core::Terminator::CRLF,
            Terminator::Any(b) => csv_core::Terminator::Any(b),
            _ => unreachable!(),
        }
    }
}

impl Default for Terminator {
    fn default() -> Terminator {
        Terminator::CRLF
    }
}

/// The whitespace preservation behaviour when reading CSV data.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Trim {
    /// Preserves fields and headers. This is the default.
    None,
    /// Trim whitespace from headers.
    Headers,
    /// Trim whitespace from fields, but not headers.
    Fields,
    /// Trim whitespace from fields and headers.
    All,
    /// Hints that destructuring should not be exhaustive.
    ///
    /// This enum may grow additional variants, so this makes sure clients
    /// don't count on exhaustive matching. (Otherwise, adding a new variant
    /// could break existing code.)
    #[doc(hidden)]
    __Nonexhaustive,
}

impl Trim {
    fn should_trim_fields(&self) -> bool {
        self == &Trim::Fields || self == &Trim::All
    }

    fn should_trim_headers(&self) -> bool {
        self == &Trim::Headers || self == &Trim::All
    }
}

impl Default for Trim {
    fn default() -> Trim {
        Trim::None
    }
}

/// A custom Serde deserializer for possibly invalid `Option<T>` fields.
///
/// When deserializing CSV data, it is sometimes desirable to simply ignore
/// fields with invalid data. For example, there might be a field that is
/// usually a number, but will occasionally contain garbage data that causes
/// number parsing to fail.
///
/// You might be inclined to use, say, `Option<i32>` for fields such as this.
/// By default, however, `Option<i32>` will either capture *empty* fields with
/// `None` or valid numeric fields with `Some(the_number)`. If the field is
/// non-empty and not a valid number, then deserialization will return an error
/// instead of using `None`.
///
/// This function allows you to override this default behavior. Namely, if
/// `Option<T>` is deserialized with non-empty but invalid data, then the value
/// will be `None` and the error will be ignored.
///
/// # Example
///
/// This example shows how to parse CSV records with numerical data, even if
/// some numerical data is absent or invalid. Without the
/// `serde(deserialize_with = "...")` annotations, this example would return
/// an error.
///
/// ```
/// use std::error::Error;
///
/// #[derive(Debug, serde::Deserialize, Eq, PartialEq)]
/// struct Row {
///     #[serde(deserialize_with = "csv::invalid_option")]
///     a: Option<i32>,
///     #[serde(deserialize_with = "csv::invalid_option")]
///     b: Option<i32>,
///     #[serde(deserialize_with = "csv::invalid_option")]
///     c: Option<i32>,
/// }
///
/// # fn main() { example().unwrap(); }
/// fn example() -> Result<(), Box<dyn Error>> {
///     let data = "\
/// a,b,c
/// 5,\"\",xyz
/// ";
///     let mut rdr = csv::Reader::from_reader(data.as_bytes());
///     if let Some(result) = rdr.deserialize().next() {
///         let record: Row = result?;
///         assert_eq!(record, Row { a: Some(5), b: None, c: None });
///         Ok(())
///     } else {
///         Err(From::from("expected at least one record but got none"))
///     }
/// }
/// ```
pub fn invalid_option<'de, D, T>(de: D) -> result::Result<Option<T>, D::Error>
where
    D: Deserializer<'de>,
    Option<T>: Deserialize<'de>,
{
    Option::<T>::deserialize(de).or_else(|_| Ok(None))
}
345}