AI File Organizer for Researchers (Zotero, Academic PDFs) | Sortio
Sortio for Researchers

AI file organizer for researchers. Rename PDFs by author and year.

Sortio renames academic PDFs from content (author, year, title), routes them into project folders, and syncs into Zotero collections via the upcoming Zotero integration. Built for PhD students, postdocs, principal investigators, and research staff with libraries of 5,000+ papers and no time to file by hand.

How academic file organization works

Four steps from a Downloads folder of unnamed PDFs to a clean project library synced with Zotero.

Step 1

Drop your PDF pile

Point Sortio at your Downloads folder, your Zotero storage directory, or any folder of downloaded papers. Preprints, conference proceedings, dissertation chapters, supplemental data, slide decks: anything you drag in.

Step 2

Read title, authors, DOI, year

Sortio parses each PDF for the title, first author, year, journal or conference, DOI, and abstract. It can optionally enrich the result against Crossref to fill in sparse metadata from older OCR'd scans. All inference can run locally with Ollama so unpublished work never leaves your machine.
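As a rough illustration of the DOI step, the sketch below pulls a DOI-like string out of extracted page text and builds the matching Crossref REST lookup URL. It is not Sortio's parser: the function names are hypothetical, the regular expression is a common approximation for modern DOIs, and no network request is made.

```python
import re
from typing import Optional
from urllib.parse import quote

# Approximate pattern for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, then a suffix of common DOI characters.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_doi(text: str) -> Optional[str]:
    """Return the first DOI-like string in the text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trailing sentence punctuation is usually not part of the DOI.
    # Limitation: a DOI wrapped in parentheses keeps the closing ")".
    return match.group(0).rstrip(".,;")


def crossref_lookup_url(doi: str) -> str:
    """Build the Crossref REST API URL for one DOI (the request is not sent here)."""
    return "https://api.crossref.org/works/" + quote(doi)


doi = find_doi("Available at https://doi.org/10.1038/nature14539.")
if doi is not None:
    print(doi, crossref_lookup_url(doi))
```

A scan with no embedded DOI returns None here, which is the case where a title-based Crossref search would be needed instead.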

Step 3

Route by project and collection

Each paper lands in the right project folder (PhD thesis, current grant, lit review, collaboration with X). The Rule Builder lets you express things like "if the abstract mentions transformer architectures, route under deep-learning/2023+ and tag for the methods chapter."
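The quoted rule can be pictured as a small function. This is only a sketch of the logic, not the Rule Builder's actual format: the Paper and Routing types, the function name, and the "methods-chapter" tag are hypothetical, and it assumes "2023+" means papers dated 2023 or later.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Paper:
    title: str
    abstract: str
    year: int


@dataclass
class Routing:
    folder: str
    tags: List[str] = field(default_factory=list)


def route_transformer_papers(paper: Paper) -> Optional[Routing]:
    """Apply the example rule; return None when the rule does not match."""
    mentions_transformers = "transformer" in paper.abstract.lower()
    # Assumption: the "2023+" folder is meant for papers from 2023 onward.
    if mentions_transformers and paper.year >= 2023:
        return Routing(folder="deep-learning/2023+", tags=["methods-chapter"])
    return None


example = Paper(
    title="An Example Paper",
    abstract="We study transformer architectures for long documents.",
    year=2024,
)
print(route_transformer_papers(example))
```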

Step 4

Sync to Zotero

When the Zotero integration ships in the next release, files routed to a project upload directly into the matching Zotero collection as an item with attachment. Metadata is preserved, citation key is generated, and your library stays the canonical reference.

The research file management problem

Six patterns we hear from PhD students, postdocs, and lab PIs running multi-project research with Zotero and a stack of PDF chaos.

5,000 PDFs, none named usefully

Every browser download saves as ssrn-id3942817.pdf, 1234.5678v2.pdf, or paper.pdf. After two years of literature review, your Downloads folder has 5,000 PDFs and you cannot tell what is what without opening each one. Search in Spotlight finds the body text but not the title.

Conference papers, preprints, and supplemental data piled together

A single project has the NeurIPS 2023 paper, the arXiv preprint that became that paper, the camera-ready version, four supplemental PDFs, the slide deck, and a CSV of the experimental data. They all live in the same folder named "transformers_stuff". Which version matters is anyone's guess.

Version control on draft chapters

thesis_v2.docx, thesis_v3_FINAL.docx, thesis_v3_FINAL_REAL.docx, thesis_v3_FINAL_REAL_after_advisor.docx. Plus the LaTeX builds, the Overleaf zip backups, the revised figures folder, and a coauthor sending feedback by email. Reconstructing which version was current on April 14th takes an hour of detective work.

Supplemental data lost from the paper

A reviewer asks for the supplemental data tables from a paper you submitted 14 months ago. The paper lives in Zotero, the figures live in OneDrive, the raw CSVs live in a local "scratch" folder, and the analysis notebook is on a colleague's GitHub. Pulling them together is a half-day excavation.

Project folders that drift over time

A new project starts clean: dedicated folder, subfolders for literature, drafts, data, and figures. Six months in, downloads land in Downloads, draft figures land on the Desktop, and the literature subfolder has six unrelated papers a collaborator emailed. The project has spread across three locations.

Citation export is a separate, manual step

You read a paper, decide to cite it, dig up the BibTeX from Google Scholar or the publisher page, paste it into your .bib file, and fix the citation key. Across a 200-paper lit review chapter, that adds up to hours of busywork that breaks the reading flow.

Document types Sortio recognizes

The categories that show up in active research and how Sortio routes each one.

Published papers and preprints

Sortio reads each paper for the title, authors, year, journal, DOI, and abstract, then renames it to a clean pattern such as Vaswani_2017_Attention-is-all-you-need.pdf. When the same paper exists as both a preprint and a camera-ready version, the two are deduplicated: the camera-ready version is treated as canonical and the preprint is kept alongside it.
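A minimal sketch of how an author-year-title filename like the one above could be assembled. The function name and the exact slug rules (first word capitalized, remaining words lowercased, hyphens between words) are assumptions inferred from the example filename, not Sortio's documented behavior.

```python
import re


def paper_filename(first_author_surname: str, year: int, title: str) -> str:
    """Build an Author_Year_Title.pdf filename from extracted metadata."""
    # Keep only letters and digits so the name is safe on any filesystem.
    surname = "".join(re.findall(r"[^\W_]+", first_author_surname))
    words = re.findall(r"[^\W_]+", title)
    # Join words with hyphens; capitalize() lowercases all but the first letter.
    slug = "-".join(words).capitalize()
    return f"{surname}_{year}_{slug}.pdf"


print(paper_filename("Vaswani", 2017, "Attention Is All You Need"))
# Vaswani_2017_Attention-is-all-you-need.pdf
```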

Supplemental data and notebooks

CSVs, parquet files, Jupyter notebook exports, raw experimental logs. Sortio routes these under the matching paper folder so the data lives next to the publication. Future-you (or a reviewer) can find the exact dataset that produced figure 3 without crawling four directories.

Draft chapters and version control

Word documents, LaTeX projects, Overleaf zips, Google Docs exports. Sortio tags each with the modification date, the chapter or section, and (when present) the reviewer name. thesis_v3_FINAL_REAL.docx becomes 2026-04-12 - Thesis Chapter 4 - after Advisor.docx and supersedes earlier versions in a clean history.

Slide decks and posters

Conference talks, lab meeting slides, poster PDFs, AAAS presentations. Sortio recognizes presentation files and routes them under the matching project, tagged by venue and date. Pulling together slides for a recurring talk is a search instead of an excavation.

Grant applications and IRB

NSF proposals, NIH R01 applications, departmental seed grant forms, IRB protocols and amendments, conflict-of-interest disclosures. Sortio routes administrative research docs into a separate Grants and IRB folder per project, with year and submission deadline tagging.

Citation and bibliography exports

BibTeX exports, RIS files, EndNote dumps, Zotero CSL JSON. Sortio recognizes citation export formats and routes them alongside the corresponding paper. When the Zotero integration ships, these exports become a roundtrip: Sortio uploads the PDF, Zotero generates the citation key, and the BibTeX entry lands in your .bib file.

Shipping: Zotero integration

Built for Zotero

The Zotero integration is shipping in the next Sortio release, as the pilot of our wider batched integrations effort (dotloop, Buildium, Rent Manager are landing alongside). It uses Zotero's API-key auth, lists your existing collections and items, parses each PDF's title and DOI, enriches via Crossref where useful, and uploads as a Zotero item with the PDF attached. Collection matching mirrors how Sortio routes documents to Clio matters or property units in the other integrations.
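For readers who want a feel for what creating an item through the Zotero Web API involves, here is a hedged sketch that only builds the request and does not send it. It illustrates the public Zotero API, not Sortio's integration code. The user ID, API key, collection key, and paper metadata are placeholders, and attaching the PDF is a separate multi-step upload that is not shown.

```python
import json
from typing import Dict, List, Tuple

ZOTERO_API = "https://api.zotero.org"


def build_item(title: str, authors: List[Tuple[str, str]], year: int,
               doi: str, collection_key: str) -> Dict:
    """Build one Zotero item object that is filed into a single collection."""
    return {
        "itemType": "journalArticle",
        "title": title,
        "creators": [
            {"creatorType": "author", "firstName": first, "lastName": last}
            for first, last in authors
        ],
        "date": str(year),
        "DOI": doi,
        "collections": [collection_key],
    }


def build_create_request(user_id: str, api_key: str, items: List[Dict]) -> Dict:
    """Describe the POST that would create the items (nothing is sent)."""
    return {
        "method": "POST",
        "url": f"{ZOTERO_API}/users/{user_id}/items",
        "headers": {
            "Zotero-API-Key": api_key,
            "Zotero-API-Version": "3",
            "Content-Type": "application/json",
        },
        # The write endpoint takes a JSON array of item objects.
        "body": json.dumps(items),
    }


# Placeholder values throughout; none of these identify a real account or paper.
item = build_item("An Example Paper", [("Ada", "Example")], 2024,
                  "10.1234/example", "ABCD2345")
request = build_create_request("123456", "EXAMPLE-KEY", [item])
print(request["url"])
```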

Today you can use Sortio with Zotero as a pre-step: point it at your Downloads folder, let it rename and route the PDFs, then drag the renamed files into Zotero (which will re-parse the metadata correctly because the filenames now match the content). When the integration goes live, that drag-and-drop step disappears.

For researchers without Zotero, Sortio works as a standalone organizer. The same author-year-title rename pattern, project folder routing, and supplemental data linking work against a plain filesystem library.

Simple pricing

Solo PhD students often fit on Free for ongoing work after a one-time Pro month to clean the historical library. Labs and PIs graduate to Team for a shared rule library across students.

Free

$0

Try it on a single folder

  • 50 AI credits to start
  • Up to 50 files per sort
  • Preview before applying
  • Sort history & undo
  • Local LLM / BYOK

Most Popular

Pro

$14.99/mo or $99/yr

For PhD students, postdocs, and PIs

  • 5,000 AI credits / month
  • Author/year/title rename + Zotero sync
  • AI sort: up to 5,000 files / run
  • Auto-sort on file change
  • BYOK (no credits used)
  • Unlimited file renaming
  • Email support (48h)

Team

$29/seat/mo

Shared workflows for the team

  • Everything in Pro
  • Unlimited credits per seat
  • Shared automations & rules
  • Admin console & seats
  • Centralized LLM policy
  • Priority support (24h)

Enterprise

$50+/seat/mo

Advanced security & compliance

  • Everything in Team
  • SSO/SAML + SCIM
  • Audit logs
  • Self-hosted deployment
  • Dedicated CSM + SLA
  • Volume discounts (25+ seats)

Researcher FAQ

Questions from PhD students, postdocs, PIs, and research staff evaluating Sortio for PDF organization and Zotero workflow.

Sortio renames PDFs by content (author, year, title), routes them into project folders, and is shipping a native Zotero integration in the next release. Once live, Sortio will upload each file directly into the matching Zotero collection as an item with attachment, with Crossref-enriched metadata and a generated citation key. Until then, use Sortio as a pre-step that renames downloaded PDFs cleanly before you drag them into Zotero.

Sortio reads each PDF for the title page, first author, year, journal, and DOI. ssrn-id3942817.pdf becomes Vaswani_2017_Attention-is-all-you-need.pdf. The exact pattern is configurable: Author_Year_TitleShort.pdf, YEAR - AUTHOR - TITLE.pdf, or your lab's own convention via the Rule Builder. For older OCR'd scans with sparse metadata, Sortio enriches via Crossref so the rename still works.

Not unless you choose the cloud classifier. In Ollama mode no document content (and no filenames) crosses the network. In BYOK mode requests go from your machine directly to your provider on your account, never through Sortio. With the default Sortio classifier, file content stays local and only filenames plus extracted metadata are sent for matching, redacted from logs after 30 days. For unpublished or IRB-sensitive work, use Ollama mode.

Yes. Sortio is filesystem-transparent: it organizes the underlying PDFs and writes clean filenames, which Mendeley and EndNote will then re-parse correctly when you add them to their libraries. Native API integration is Zotero-first because Zotero has the most open API and the largest researcher base, but Mendeley and EndNote users get the rename-and-route benefit through the standard Sortio workflow.

Sortio recognizes the same paper as preprint (arXiv, SSRN, bioRxiv) and as camera-ready (NeurIPS, Nature, ACL) by matching titles and authors. The camera-ready version is treated as canonical and the preprint is kept alongside in a preprints subfolder, both renamed consistently. The Rule Builder lets you decide which version your bibliography prefers and routes citations accordingly.
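One simple way to picture title-and-author matching is to normalize both fields into a key and group files that share it. This sketch makes several assumptions: the dictionary fields, the function names, and the list of preprint sources are illustrative, and a real matcher would also need to tolerate titles that changed between preprint and publication.

```python
import re
from typing import Dict, List, Tuple

PREPRINT_SOURCES = frozenset({"arxiv", "ssrn", "biorxiv"})


def match_key(title: str, first_author_surname: str) -> Tuple[str, str]:
    """Normalize title and surname so small formatting differences do not matter."""
    normalized_title = " ".join(re.findall(r"[^\W_]+", title.lower()))
    return (normalized_title, first_author_surname.strip().lower())


def group_versions(papers: List[Dict]) -> Dict[Tuple[str, str], List[Dict]]:
    """Group records that look like versions of the same paper."""
    groups: Dict[Tuple[str, str], List[Dict]] = {}
    for paper in papers:
        key = match_key(paper["title"], paper["author"])
        groups.setdefault(key, []).append(paper)
    return groups


def pick_canonical(versions: List[Dict]) -> Dict:
    """Prefer a published version; fall back to the first preprint."""
    published = [v for v in versions if v["source"].lower() not in PREPRINT_SOURCES]
    return published[0] if published else versions[0]


papers = [
    {"title": "Attention Is All You Need", "author": "Vaswani", "source": "arXiv"},
    {"title": "Attention is all you need.", "author": "Vaswani", "source": "NeurIPS"},
]
for versions in group_versions(papers).values():
    print(pick_canonical(versions)["source"], len(versions))
```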

Yes. CSVs, parquet files, Jupyter notebook exports, and raw experimental logs are routed under the matching paper folder. The pattern is Paper / data / and Paper / notebooks / so future-you (or a reviewer six months from now) can find the exact dataset that produced a figure without crawling a separate scratch directory.
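The Paper / data / and Paper / notebooks / layout can be sketched as a lookup from file extension to subfolder. The extension sets and the function name are assumptions made for illustration, and matching a file to the right paper in the first place is not covered here.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed extension groups; a real configuration would likely be broader.
DATA_EXTENSIONS = {".csv", ".parquet", ".log"}
NOTEBOOK_EXTENSIONS = {".ipynb"}


def supplemental_destination(paper_folder: str, filename: str) -> Optional[PurePosixPath]:
    """Return the destination under the paper folder, or None for other file types."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in DATA_EXTENSIONS:
        subfolder = "data"
    elif suffix in NOTEBOOK_EXTENSIONS:
        subfolder = "notebooks"
    else:
        return None
    return PurePosixPath(paper_folder) / subfolder / filename


print(supplemental_destination("Vaswani_2017_Attention-is-all-you-need", "results.csv"))
```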

Sortio tags each Word, LaTeX, or text document with the modification date, the chapter or section name extracted from the file, and the reviewer or coauthor name if it appears in the filename. thesis_v3_FINAL_REAL.docx becomes 2026-04-12 - Thesis Chapter 4 - after Advisor.docx with earlier versions moved to a superseded subfolder. Sortio does not replace Git for the actual diff, but it makes the version timeline visible.
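As a sketch of the naming scheme described above, the functions below pull a reviewer name from an "after_<name>" fragment in a filename and assemble the dated name. Both function names and the "after_" convention are assumptions based on the example. The section name is passed in directly because extracting it from the file contents is outside this sketch.

```python
import re
from datetime import date
from typing import Optional


def reviewer_from_filename(filename: str) -> Optional[str]:
    """Return the name following 'after_' in a filename, if there is one."""
    match = re.search(r"after[_ -]+([A-Za-z]+)", filename, re.IGNORECASE)
    return match.group(1).capitalize() if match else None


def draft_filename(modified: date, section: str,
                   reviewer: Optional[str], extension: str) -> str:
    """Build 'YYYY-MM-DD - Section - after Reviewer.ext' (the reviewer part is optional)."""
    parts = [modified.isoformat(), section]
    if reviewer:
        parts.append(f"after {reviewer}")
    return " - ".join(parts) + extension


reviewer = reviewer_from_filename("thesis_v3_FINAL_REAL_after_advisor.docx")
print(draft_filename(date(2026, 4, 12), "Thesis Chapter 4", reviewer, ".docx"))
# 2026-04-12 - Thesis Chapter 4 - after Advisor.docx
```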

Sortio does not currently generate BibTeX directly; it organizes files and metadata. Zotero is your citation generator. When the Zotero integration ships, each file Sortio uploads triggers Zotero's metadata extraction and citation key generation, which you can then export as BibTeX, RIS, or CSL JSON from Zotero in the usual way.

Yes. Create a rule on one machine, export it as a JSON file, and import it on each lab member's install. Team plan ($29 per seat per month) adds a shared rule library so updates propagate automatically. The lab keeps a single naming convention even as new students join.

For a one-time historical sort, use the Rule Builder. Rule-based sorts have no file limit and use no AI credits. We typically see a 5,000-paper library renamed and routed in under an hour on a modern Mac. New papers landing afterward in Downloads can use AI Sort (which is more flexible but credit-metered) or be funneled through the same rule.

Yes. Sortio ships for macOS, Windows, and Linux. Researchers on Linux compute clusters can run Sortio against shared scratch space or home directories. The Free tier is sufficient to test the workflow on a single project folder before committing.

Free tier: 50 AI credits and up to 50 files per sort, enough to try one project folder. Pro: $14.99 per month or $99 per year (about $8.25 per month when billed annually), includes 5,000 AI credits per month and the Zotero integration. Many PhD students fit under the Free tier for ongoing work after a one-time Pro month to clean their historical library.

Request a Demo

Tell us about your lab or project and we will show you Sortio working with a sample of your PDF library.

By submitting, you agree to be contacted about Sortio. We respect your privacy.