UniProt MCP Server
A Model Context Protocol (MCP) server that provides access to UniProt protein information. This server allows AI assistants to fetch protein function and sequence information directly from UniProt.
Features
- Get protein information by UniProt accession number
- Batch retrieval of multiple proteins
- Caching for improved performance (24-hour TTL)
- Error handling and logging
- Information includes:
  - Protein name
  - Function description
  - Full sequence
  - Sequence length
  - Organism
Quick Start
- Ensure you have Python 3.10 or higher installed
- Clone this repository:
git clone https://github.com/TakumiY235/uniprot-mcp-server.git
cd uniprot-mcp-server
- Install dependencies:
# Using uv (recommended)
uv pip install -r requirements.txt
# Or using pip
pip install -r requirements.txt
Configuration
Add to your Claude Desktop config file:
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "uniprot": {
      "command": "uv",
      "args": ["--directory", "path/to/uniprot-mcp-server", "run", "uniprot-mcp-server"]
    }
  }
}
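If you would rather script this step than edit the file by hand, something along these lines could work. It is only a sketch: `add_uniprot_server` is a hypothetical helper that is not part of this repository, and it assumes the config file is plain JSON with a top-level `mcpServers` object like the one above. Other server entries already in the file are left untouched.

```python
import json
from pathlib import Path


def add_uniprot_server(config_path: Path, server_dir: str) -> dict:
    """Add (or overwrite) the "uniprot" entry in a Claude Desktop config file.

    Hypothetical helper for illustration; not shipped with this project.
    """
    # Start from the existing config if there is one, otherwise from scratch.
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Keep any other configured servers and only set the "uniprot" key.
    servers = config.setdefault("mcpServers", {})
    servers["uniprot"] = {
        "command": "uv",
        "args": ["--directory", server_dir, "run", "uniprot-mcp-server"],
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Pass the config path for your platform from the list above and the directory where you cloned this repository, then restart Claude Desktop.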
Usage Examples
After configuring the server in Claude Desktop, you can ask questions like:
Can you get the protein information for UniProt accession number P98160?
For batch queries:
Can you get and compare the protein information for both P04637 and P02747?
API Reference
Tools
- get_protein_info
  - Get information for a single protein
  - Required parameter: accession (UniProt accession number)
  - Example response:
{
  "accession": "P12345",
  "protein_name": "Example protein",
  "function": ["Description of protein function"],
  "sequence": "MLTVX...",
  "length": 123,
  "organism": "Homo sapiens"
}
- get_batch_protein_info
  - Get information for multiple proteins
  - Required parameter: accessions (array of UniProt accession numbers)
  - Returns an array of protein information objects
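To make the response shape concrete, here is a minimal sketch of how a UniProtKB JSON entry might be reduced to the fields listed above. This is not the server's actual code: `to_protein_info` is a hypothetical name, the fallback strings are assumptions, and the input field names (`primaryAccession`, `proteinDescription`, `comments`, `sequence`, `organism`) reflect the UniProtKB REST JSON layout as I understand it. The sample entry is a made-up placeholder, not real UniProt data.

```python
from typing import Any


def to_protein_info(entry: dict[str, Any]) -> dict[str, Any]:
    """Reduce a UniProtKB-style JSON entry to the tool's response shape.

    Hypothetical helper; the field names on the input side are assumptions
    about the UniProt REST API's JSON format.
    """
    name = (
        entry.get("proteinDescription", {})
        .get("recommendedName", {})
        .get("fullName", {})
        .get("value", "Unknown protein")  # fallback text is an assumption
    )
    # Collect the free-text parts of every FUNCTION comment.
    functions = [
        text["value"]
        for comment in entry.get("comments", [])
        if comment.get("commentType") == "FUNCTION"
        for text in comment.get("texts", [])
        if "value" in text
    ]
    sequence = entry.get("sequence", {}).get("value", "")
    return {
        "accession": entry.get("primaryAccession", ""),
        "protein_name": name,
        "function": functions,
        "sequence": sequence,
        "length": len(sequence),
        "organism": entry.get("organism", {}).get("scientificName", "Unknown"),
    }


# Placeholder entry for illustration only.
sample_entry = {
    "primaryAccession": "P12345",
    "proteinDescription": {"recommendedName": {"fullName": {"value": "Example protein"}}},
    "organism": {"scientificName": "Homo sapiens"},
    "comments": [
        {"commentType": "FUNCTION", "texts": [{"value": "Description of protein function"}]},
        {"commentType": "SUBUNIT", "texts": [{"value": "Ignored by this sketch"}]},
    ],
    "sequence": {"value": "MLTVA", "length": 5},
}
info = to_protein_info(sample_entry)
```

A batch call would then return a list of such objects, one per requested accession.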
Development
Setting up development environment
- Clone the repository
- Create a virtual environment:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Install development dependencies:
pip install -e ".[dev]"
Running tests
pytest
Code style
This project uses:
- Black for code formatting
- isort for import sorting
- flake8 for linting
- mypy for type checking
- bandit for security checks
- safety for dependency vulnerability checks
Run all checks:
black .
isort .
flake8 .
mypy .
bandit -r src/
safety check
Technical Details
- Built using the MCP Python SDK
- Uses httpx for async HTTP requests
- Implements caching with 24-hour TTL using an OrderedDict-based cache
- Handles rate limiting and retries
- Provides detailed error messages
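For readers curious what an OrderedDict-based cache with a TTL can look like, the following is a rough sketch rather than the code in this repository. The class name `TTLCache`, the default `max_size` of 100, and the choice to evict the least recently used entry when the cache is full are all assumptions; only the 24-hour TTL and the use of `OrderedDict` come from the description above.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Small cache with per-entry expiry and a size limit (illustrative only)."""

    def __init__(
        self,
        max_size: int = 100,                 # assumed limit, not from the project
        ttl_seconds: float = 24 * 60 * 60,   # 24-hour TTL
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._data = OrderedDict()  # key -> (stored_at, value), oldest first
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is too old: drop it and report a miss.
            del self._data[key]
            return None
        # Mark as most recently used.
        self._data.move_to_end(key)
        return value

    def set(self, key: str, value: Any) -> None:
        # Re-inserting moves the key to the end and refreshes its timestamp.
        self._data.pop(key, None)
        self._data[key] = (self._clock(), value)
        # Evict least recently used entries once over the size limit.
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)
```

Injecting the clock keeps the expiry logic testable without waiting a day; in normal use the default `time.monotonic` is enough.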
Error Handling
The server handles various error scenarios:
- Invalid accession numbers (404 responses)
- API connection issues (network errors)
- Rate limiting (429 responses)
- Malformed responses (JSON parsing errors)
- Cache management (TTL and size limits)
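The scenarios above could be handled roughly as in the sketch below. It is an illustration under several assumptions, not the server's implementation: `ProteinLookupError` and `fetch_with_retry` are hypothetical names, the transport is passed in as a plain `send` callable so the example stays independent of httpx, and the exponential backoff schedule (1 s, 2 s, ...) is a guess at a sensible default.

```python
import json
import time
from typing import Any, Callable


class ProteinLookupError(Exception):
    """Hypothetical error type carrying a user-facing message."""


def fetch_with_retry(
    send: Callable[[str], tuple[int, str]],  # returns (status_code, body_text)
    accession: str,
    max_attempts: int = 3,                   # assumed retry budget
    base_delay: float = 1.0,                 # assumed backoff base, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    for attempt in range(max_attempts):
        try:
            status, body = send(accession)
        except OSError as exc:
            # Connection-level failure (DNS, refused connection, timeout, ...).
            raise ProteinLookupError(f"Could not reach UniProt: {exc}") from exc

        if status == 404:
            raise ProteinLookupError(f"No UniProt entry found for accession {accession!r}")
        if status == 429:
            if attempt + 1 < max_attempts:
                # Back off exponentially before the next attempt.
                sleep(base_delay * 2 ** attempt)
                continue
            raise ProteinLookupError("Rate limited by UniProt; retries exhausted")
        if status != 200:
            raise ProteinLookupError(f"Unexpected HTTP status {status}")

        try:
            return json.loads(body)
        except json.JSONDecodeError as exc:
            raise ProteinLookupError("UniProt returned malformed JSON") from exc

    raise ProteinLookupError("No attempts were made (max_attempts must be >= 1)")
```

Only 429 responses are retried here; whether other failures deserve a retry is a design choice this sketch leaves open.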
Contributing
We welcome contributions! Please feel free to submit a Pull Request. Here's how you can contribute:
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Please make sure to update tests as appropriate and adhere to the existing coding style.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- UniProt for providing the protein data API
- Anthropic for the Model Context Protocol specification
- Contributors who help improve this project