Getting Started

Introduction

The crawler.dev API is a universal text extraction API that helps developers extract text from any kind of file or webpage. Whether you're building data processing pipelines or working with LLMs/AI, the API makes it easy to get the text you need.

What can you extract?

Our API supports a wide range of content sources:

Documents: .pdf, .doc, .ppt, .xls, and many other formats
Webpages: Extract text from live webpages, including non-HTML URLs such as .pdf and .csv files
Clean output: Extract clean plain text or Markdown

Key Features

✅ Multiple Formats: Support for various document types and web content
✅ Clean Text: Advanced parsing removes formatting and extracts meaningful content
✅ Simple API: REST endpoints & SDKs that are easy to integrate
✅ Scalable: Handle everything from single requests to high-volume processing

Quick Example

Here's how simple it is to extract text from a document:

curl -X POST https://api.crawler.dev/v1/extract/file \
  -H "x-api-key: YOUR_API_KEY" \
  -F "file=@document.pdf"
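
The same upload can also be made from code. The following is a minimal Python sketch using only the standard library. The endpoint, the x-api-key header, and the file form field come from the curl command above; the function names build_extract_request and extract_file are hypothetical, and the application/octet-stream part type is an assumption rather than a documented requirement.

```python
import json
import os
import urllib.request
import uuid

# Endpoint taken from the curl example above.
EXTRACT_FILE_URL = "https://api.crawler.dev/v1/extract/file"


def build_extract_request(api_key: str, filename: str, data: bytes,
                          content_type: str = "application/octet-stream"
                          ) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST with one "file" field.

    Hypothetical helper: assumes the filename contains no double quotes.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "x-api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(EXTRACT_FILE_URL, data=body,
                                  headers=headers, method="POST")


def extract_file(api_key: str, path: str) -> dict:
    """Upload the file at `path` and return the decoded JSON response."""
    with open(path, "rb") as fh:
        request = build_extract_request(api_key, os.path.basename(path), fh.read())
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Under these assumptions, extract_file("YOUR_API_KEY", "document.pdf") would send the same request as the curl command. Building the request separately from sending it makes it possible to inspect the headers and body without touching the network.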

Example response:

{
  "filename": "document.pdf",
  "contentType": "application/pdf",
  "sizeBytes": 1024,
  "text": "This is the extracted text content..."
}
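
To work with the response in code, the four fields shown above can be loaded into a small typed structure. This is a sketch only: the ExtractResult class and its from_json method are hypothetical names, and it assumes the response contains exactly the fields in the sample (filename, contentType, sizeBytes, text).

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractResult:
    """Hypothetical container for the fields in the sample response."""
    filename: str
    content_type: str
    size_bytes: int
    text: str

    @classmethod
    def from_json(cls, payload: str) -> "ExtractResult":
        data = json.loads(payload)
        # Map the camelCase JSON keys onto snake_case attributes.
        return cls(
            filename=data["filename"],
            content_type=data["contentType"],
            size_bytes=data["sizeBytes"],
            text=data["text"],
        )


# The sample response from the Quick Example.
sample = """
{
  "filename": "document.pdf",
  "contentType": "application/pdf",
  "sizeBytes": 1024,
  "text": "This is the extracted text content..."
}
"""

result = ExtractResult.from_json(sample)
print(result.filename, result.size_bytes)  # prints: document.pdf 1024
```

A missing key raises KeyError here, which surfaces an unexpected response shape early instead of passing partial data downstream.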

Getting Started

Ready to start extracting text? Here's what you need to do:

Get an API Key - Sign up and get your API credentials
Try the Quick Start - Make your first API call in minutes
Explore the API - See all available endpoints and examples
Check Examples - Real-world integration examples

Need Help?

📖 Documentation: Complete API reference and guides
💬 Support: Get help with integration questions
🐛 Issues: Report bugs or request features

Let's get started! 🚀
