| Title: | Check Text Files Content at a Glance |
|---|---|
| Description: | Tools to help import text files. They can return the number of lines; print the first and last lines; convert encoding; guess delimiters and file encoding. Operations start without reading the entire file first, which gives good performance on large files. This package provides an alternative to the 'head', 'tail', 'wc' and 'iconv' programs, which are not always available on machines where R is installed. |
| Authors: | David Gohel [aut, cre] |
| Maintainer: | David Gohel <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.2.001 |
| Built: | 2026-05-25 17:19:44 UTC |
| Source: | https://github.com/davidgohel/fpeek |
Returns the number of lines found in a file. The count is obtained by counting the newline characters in the file.
peek_count_lines(path, with_eof = FALSE)
| path | file path |
|---|---|
| with_eof | count the end of file as a new line. |
number of lines as an integer
f <- system.file(package = "fpeek", "datafiles", "test-tab.csv")
peek_count_lines(f)
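The counting approach described above can be illustrated in base R. This is only a sketch, not fpeek's implementation: `count_newlines` and its `chunk_size` argument are hypothetical names, and the `with_eof` behaviour is not reproduced. It reads the file in fixed-size binary chunks and counts newline bytes, so the whole file never has to be held in memory.

```r
# Hypothetical helper (not part of fpeek): count newline bytes chunk by chunk.
count_newlines <- function(path, chunk_size = 65536L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  total <- 0L
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break              # end of file reached
    total <- total + sum(chunk == as.raw(0x0a)) # 0x0a is the newline byte
  }
  total
}

# Small demonstration file: two newline bytes, no trailing newline.
tmp <- tempfile()
writeBin(charToRaw("a\nb\nc"), tmp)
count_newlines(tmp)
```

Because only newline bytes are counted, a last line without a trailing newline is not included in the total; the `with_eof` argument documented above appears to exist for that case.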
Guess the delimiter, quote character and decimal mark of a
delimited text file. The function splits each of the first
n lines by each candidate delimiter and selects the
delimiter that produces the most consistent number of fields.
The algorithm is adapted from the vroom package.
peek_guess_delim(
  path,
  delims = c(",", "\t", " ", "|", ":", ";"),
  quotes = c("\"", "'"),
  n = 1024
)
| path | path to the text file. |
|---|---|
| delims | character vector of candidate delimiters. |
| quotes | character vector of candidate quote characters. |
| n | number of lines to read for guessing (default 1024). |
a named list with elements:
the guessed delimiter character (or NULL)
the guessed quote character (or NULL)
the guessed decimal mark (or NULL)
f <- system.file(package = "fpeek", "datafiles", "test-comma.csv")
peek_guess_delim(f)
f <- system.file(package = "fpeek", "datafiles", "test-semicolon.csv")
peek_guess_delim(f)
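The field-count idea behind delimiter guessing can be sketched in base R. This is a simplified illustration, not the algorithm used by fpeek or vroom: `guess_delim_sketch` is a hypothetical helper that accepts a delimiter only when every line splits into the same number of fields, and it ignores quote characters and decimal marks entirely.

```r
# Hypothetical helper, not fpeek's implementation.
guess_delim_sketch <- function(lines,
                               delims = c(",", "\t", " ", "|", ":", ";")) {
  best <- NULL
  best_fields <- 1L
  for (d in delims) {
    # Number of fields per line when splitting on the literal delimiter
    counts <- lengths(strsplit(lines, d, fixed = TRUE))
    # Keep a delimiter only if every line yields the same field count,
    # and prefer the candidate that yields the most fields
    if (length(unique(counts)) == 1L && counts[1] > best_fields) {
      best <- d
      best_fields <- counts[1]
    }
  }
  best  # NULL when no candidate splits the lines consistently
}

# The comma splits these lines unevenly (1 and 4 fields), the semicolon evenly.
guess_delim_sketch(c("a;b;c", "1,5;2,5;3,5"))
```

In this example the comma is rejected because it appears only as a decimal mark on the second line, which is the kind of ambiguity that makes consistency across lines a useful criterion.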
Detect the encoding of a text file. This function is a
wrapper around guess_encoding from
the readr package, returning the best candidate
as a character string.
readr must be installed (it is listed in Suggests).
If it is not available, the function stops with an informative
message.
peek_guess_encoding(path)
| path | path to the text file. |
|---|---|
a character string giving the most likely encoding.
## Not run:
f <- system.file(package = "fpeek", "datafiles", "cigfou-ISO-8859-1.txt")
peek_guess_encoding(f)
## End(Not run)
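The wrapper pattern described above, an optional dependency checked at call time and then reduced to its best candidate, might look roughly like the following. `guess_encoding_sketch` and its error message are hypothetical; the code assumes only that `readr::guess_encoding()` returns a table with `encoding` and `confidence` columns, and fpeek's actual code may differ.

```r
# Hypothetical wrapper; fpeek's own code may differ.
guess_encoding_sketch <- function(path) {
  # readr is optional, so check for it when the function is called
  if (!requireNamespace("readr", quietly = TRUE)) {
    stop("Package 'readr' is required to guess encodings; please install it.")
  }
  candidates <- readr::guess_encoding(path)
  if (nrow(candidates) == 0L) {
    return(NA_character_)  # no candidate found
  }
  # Keep only the candidate with the highest confidence
  candidates$encoding[which.max(candidates$confidence)]
}
```

Checking with `requireNamespace()` inside the function keeps readr out of the hard dependencies while still failing with a clear message when it is missing.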
Prints the first n lines of a file.
peek_head(path, n = 10, intern = FALSE)
| path | file path |
|---|---|
| n | number of lines to print |
| intern | a logical which indicates whether to capture the output as an R character vector or to print the output in the R console. |
f <- system.file(package = "fpeek", "datafiles", "test-tab.csv")
peek_head(f, n = 4)
peek_head(f, n = 4, intern = TRUE)
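For comparison, the same print-or-capture behaviour can be approximated in base R. This is a sketch under stated assumptions, not how fpeek works internally: `head_sketch` is a hypothetical helper that relies on `readLines()` stopping after `n` lines, so the rest of the file is never read.

```r
# Hypothetical helper (not part of fpeek).
head_sketch <- function(path, n = 10, intern = FALSE) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # readLines() stops after n lines instead of reading the whole file
  lines <- readLines(con, n = n, warn = FALSE)
  if (intern) {
    return(lines)      # capture as a character vector
  }
  writeLines(lines)    # otherwise print to the console
  invisible(NULL)
}

tmp <- tempfile()
writeLines(as.character(1:100), tmp)
head_sketch(tmp, n = 3)                         # prints the first three lines
first <- head_sketch(tmp, n = 3, intern = TRUE) # returns them instead
```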
Read a file, convert the encoding of characters and print the result.
peek_iconv(path, from, to = "UTF-8", newfile = NULL)
| path | file path |
|---|---|
| from | the code set in which the input is encoded. |
| to | the code set to which the output is to be converted. |
| newfile | result file. Defaults to NULL. If NULL, the result is printed in the R console; otherwise a file containing the result is produced. |
la_cigale <- system.file(package = "fpeek", "datafiles", "cigfou-ISO-8859-1.txt")
peek_iconv(la_cigale, from = "ISO-8859-1", to = "UTF-8")
newfile <- tempfile()
peek_iconv(la_cigale, from = "ISO-8859-1", to = "UTF-8", newfile = newfile)
peek_head(newfile, n = 10)
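A chunked conversion along these lines can be sketched with base R's `iconv()`. The helper `iconv_file_sketch` and its `chunk_lines` argument are hypothetical and only illustrate the idea of converting a file piece by piece, writing either to the console or to a new file; this is not fpeek's implementation.

```r
# Hypothetical helper (not part of fpeek): convert a file in chunks of lines.
iconv_file_sketch <- function(path, from, to = "UTF-8", newfile = NULL,
                              chunk_lines = 1000L) {
  input <- file(path, open = "rb")
  on.exit(close(input), add = TRUE)
  # Write to the console when no result file is requested
  output <- if (is.null(newfile)) stdout() else file(newfile, open = "wb")
  if (!is.null(newfile)) on.exit(close(output), add = TRUE)
  repeat {
    lines <- readLines(input, n = chunk_lines, warn = FALSE)
    if (length(lines) == 0L) break
    # useBytes = TRUE writes the converted bytes without re-encoding them
    writeLines(iconv(lines, from = from, to = to), output, useBytes = TRUE)
  }
  invisible(NULL)
}

# "café" followed by a newline, encoded in ISO-8859-1 (e-acute is byte 0xe9)
src <- tempfile()
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), src)
dst <- tempfile()
iconv_file_sketch(src, from = "ISO-8859-1", to = "UTF-8", newfile = dst)
readBin(dst, what = "raw", n = 10)  # in UTF-8 the e-acute becomes c3 a9
```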
Prints the last n lines of a file.
peek_tail(path, n = 10, intern = FALSE)
| path | file path |
|---|---|
| n | number of lines to print |
| intern | a logical which indicates whether to capture the output as an R character vector or to print the output in the R console. |
f <- system.file(package = "fpeek", "datafiles", "test-tab.csv")
peek_tail(f, n = 4)
peek_tail(f, n = 4, intern = TRUE)
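Reading the end of a file without scanning it from the start can be sketched in base R by seeking backwards in blocks. `tail_sketch` and `block_size` are hypothetical names, the sketch always returns a character vector rather than printing, and it assumes a seekable file with `\n` line endings; fpeek's own approach may differ.

```r
# Hypothetical helper (not part of fpeek): read the last n lines of a file.
tail_sketch <- function(path, n = 10, block_size = 4096L) {
  size <- file.size(path)
  con <- file(path, open = "rb")
  on.exit(close(con))
  bytes <- raw(0)
  pos <- size
  # Read blocks backwards until they hold more than n newlines
  # (so the first kept line is complete) or the file start is reached
  while (pos > 0 && sum(bytes == as.raw(0x0a)) <= n) {
    start <- max(0, pos - block_size)
    seek(con, where = start, origin = "start")
    bytes <- c(readBin(con, what = "raw", n = pos - start), bytes)
    pos <- start
  }
  lines <- strsplit(rawToChar(bytes), "\n", fixed = TRUE)[[1]]
  utils::tail(lines, n)
}

tmp <- tempfile()
writeBin(charToRaw(paste0(1:100, "\n", collapse = "")), tmp)
tail_sketch(tmp, n = 3)
```

Bytes are concatenated before being converted to text, so a multi-byte character that straddles a block boundary is not split.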