Skip to content
arctic

Querying

Read imported records back out of the Parquet shards with filters on author, score, date, type, and a substring match.

query reads records back out of the Parquet shards of an entity you have already pulled or fetched. It scans the shards for a subreddit or account, applies the filters you pass, and renders the matching records through the same formatter every command uses.

arctic query golang

With no filters it renders every imported record for r/golang. The interesting part is the filters.

Subreddit or user

The argument is a subreddit by default. To read an account instead, prefix it with u/ or pass --user:

arctic query golang              # the subreddit r/golang
arctic query spez --user         # the account u/spez
arctic query u/spez              # the same account, by prefix

Both forms accept the loose prefixes (r/golang, /u/spez) and normalize them.

Filters

Every filter narrows the scan. Combine as many as you want; a record must match all of them.

By author

arctic query golang --author rob_pike

--author matches the record's author exactly, normalized the same way the entity arguments are, so u/rob_pike and rob_pike both work.

By substring

--contains does a case-insensitive substring match against the body of a comment and the title and selftext of a submission:

arctic query golang --contains "context deadline"
arctic query golang --contains generics --kind submissions

By score

--min-score keeps only records at or above a score:

arctic query golang --min-score 100
arctic query golang --author rob_pike --min-score 500

By date

--after and --before bound the record's creation time. Each takes YYYY, YYYY-MM, or YYYY-MM-DD:

arctic query golang --after 2024-01-01
arctic query golang --after 2020 --before 2021
arctic query golang --after 2024-01 --before 2024-04 --contains generics

By type

--kind chooses which record types to scan: comments, submissions, or both (the default):

arctic query golang --kind submissions
arctic query golang --kind comments --author rob_pike

When you ask for a single type, the output columns match that type. When you ask for both and the result has both, comments render first and submissions after.

Limiting the result

-n (the global --limit) caps the number of records returned; 0, the default, means no limit:

arctic query golang --contains generics -n 20

How the scan works

query reads the Parquet shards for the entity and type under the data directory and filters them as it goes. It reads the columnar shards directly, so it does not depend on the SQLite index, and it covers everything you imported for that entity regardless of which acquisition path or month it came from, because every shard shares the same fixed schema. A query that matches nothing exits with code 3 (no data), so a script can tell an empty result from an error.

Composing

Because the output adapts to the destination, a query pipes cleanly. Pull one field out of the matches:

arctic query golang --contains generics -o jsonl | jq -r .author | sort | uniq -c

Keep a few columns for a quick look:

arctic query golang --kind submissions --min-score 1000 --fields author,score,title

Apply a template per record:

arctic query golang --contains generics --template '{{.score}}	{{.title}}'

See output formats for the full set of renderings and the troubleshooting reference for what the exit codes mean.