getscipapers is a Python package designed for searching and requesting scientific papers from multiple sources. This project is a work in progress and primarily intended for personal use. It is not a comprehensive solution for accessing scientific papers. Portions of the code were developed with assistance from GitHub Copilot.
To access the Nexus Search database, install IPFS Kubo. On 64-bit Linux, for example:
wget https://dist.ipfs.tech/kubo/v0.35.0/kubo_v0.35.0_linux-amd64.tar.gz
tar -xvzf kubo_v0.35.0_linux-amd64.tar.gz
cd kubo
sudo ./install.sh
Verify installation:
ipfs --version
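If you script your setup, the same check can be done from Python. The sketch below is an illustration only and is not part of getscipapers. The helper names parse_ipfs_version and installed_ipfs_version are hypothetical, and the code assumes that ipfs --version prints a line containing a dotted version number such as 0.35.0.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_ipfs_version(output: str) -> Optional[str]:
    """Extract the first X.Y.Z version number from the given text."""
    match = re.search(r"(\d+\.\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_ipfs_version() -> Optional[str]:
    """Return the installed Kubo version, or None if ipfs is not on PATH."""
    exe = shutil.which("ipfs")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    return parse_ipfs_version(result.stdout)
```

Calling installed_ipfs_version() returns None when the ipfs binary cannot be found, which is a convenient signal that the install step above still needs to be run.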
Alternatively, you can interact with the Nexus Telegram bot. To do so, create a Telegram account and obtain your API ID and API hash from my.telegram.org.
(Optional) Obtain free API keys from Elsevier, Wiley, or IEEE (IEEE support not yet implemented).
(Optional) Create accounts at Sci-Net, AbleSci, Science Hub Mutual Aid, Z-Library or Facebook to request or download papers/books. For Facebook, join the relevant group after creating your account.
It is recommended to use a virtual environment to avoid conflicts with other Python packages. You can use venv or virtualenv. To set up the environment and install dependencies:
# Clone the repository
git clone https://github.com/hoanganhduc/getscipapers.git
cd getscipapers
# Create and activate a virtual environment (change the path if desired)
python -m venv ~/.getscipapers
source ~/.getscipapers/bin/activate
# Upgrade pip and install dependencies
pip install --upgrade pip
pip install build
pip install -r requirements.txt
# Build and install the package in editable mode
python -m build
pip install -e .
# Clean up build artifacts
rm -rf build/ dist/ *.egg-info/
find . -type d -name __pycache__ -exec rm -rf {} +
find . -type f -name "*.pyc" -delete
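To confirm that the virtual environment is actually active before or after installing, you can compare the interpreter prefixes. This is a minimal sketch using only the standard library; the function name in_virtualenv is hypothetical and not something getscipapers provides.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if getattr(sys, "real_prefix", None) is not None:
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
```

If this prints False after you have run the source command above, the activation most likely happened in a different shell session.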
To use the Nexus Search database, start the IPFS daemon (if this is your first time, run ipfs init first) in one terminal:
ipfs daemon
In another terminal, use the getscipapers command to search for and request scientific papers. For usage details, run:
getscipapers --help
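Before searching Nexus, it can be useful to check that the daemon from the first terminal is really listening. The sketch below assumes the daemon uses Kubo's default API address, 127.0.0.1 on port 5001; adjust it if you changed your IPFS configuration. The helper name port_is_open is hypothetical.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 5001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if port_is_open():
    print("IPFS API port is reachable")
else:
    print("IPFS API port is not reachable; is 'ipfs daemon' running?")
```

This only tests that something accepts connections on the port. It does not verify that the listener is an IPFS daemon.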
The fastest way to run getscipapers is via GitHub Codespaces. This provides a preconfigured environment, eliminating local setup. To use it:
Create a new codespace from your forked repository. This will automatically set up the environment with all dependencies installed. You can also use GitHub CLI to create a codespace, for example:
gh codespace create --repo hoanganhduc/getscipapers --branch master --machine basicLinux32gb
The basicLinux32gb machine type provides 2 cores, 8GB RAM, and 32GB storage. See GitHub Codespaces documentation for more machine types such as standardLinux32gb, premiumLinux, and largePremiumLinux.
Once the codespace is running, you can run getscipapers commands directly.
This guide explains how to use getscipapers inside a Docker container. The container includes all dependencies, so you can start downloading scientific papers immediately, with no manual setup required.
To get started quickly, pull the latest image from GitHub Container Registry and run it:
docker pull ghcr.io/hoanganhduc/getscipapers:latest
docker run -it --rm -v "$(pwd)":/workspace ghcr.io/hoanganhduc/getscipapers:latest
This mounts your current directory to /workspace inside the container for easy file access.
To build the image yourself:
docker build -t getscipapers .
docker run -it --rm -v "$(pwd)":/workspace getscipapers
To keep the container running in the background and ensure downloads and configuration persist:
docker run -d \
  --name getscipapers-container \
  --restart always \
  -v $HOME/Downloads:/home/getscipaper/Downloads \
  -v $HOME/.config/getscipapers:/home/getscipaper/.config/getscipapers \
  ghcr.io/hoanganhduc/getscipapers:latest
This setup saves downloaded papers and settings to your host machine. Adjust folder paths as needed.
To use IPFS with getscipapers, run an IPFS Kubo daemon in a separate container:
docker pull ipfs/kubo:latest
sudo ufw allow 4001
sudo ufw allow 8080
sudo ufw allow 5001
export ipfs_staging=$HOME/.ipfs
export ipfs_data=$HOME/.ipfs
docker run -d \
  --name ipfs_host \
  --restart always \
  -v $ipfs_staging:/export \
  -v $ipfs_data:/data/ipfs \
  -p 4001:4001 \
  -p 8080:8080 \
  -p 5001:5001 \
  ipfs/kubo:latest
This starts the IPFS daemon with persistent storage and required ports. Adjust folder paths as needed.
To run getscipapers inside the container:
docker exec -it getscipapers-container getscipapers --help
For easier access, create a script at ~/.local/bin/getscipapers:
#!/bin/bash
CONTAINER_NAME="getscipapers-container"

if [ $# -lt 1 ]; then
    echo "Usage: $0 [arguments...]"
    exit 1
fi

COMMAND=("getscipapers" "$@")

if ! command -v docker &> /dev/null; then
    echo "Error: Docker is not installed"
    exit 1
fi

if ! docker ps -q -f name="$CONTAINER_NAME" | grep -q .; then
    echo "Error: Container '$CONTAINER_NAME' is not running"
    exit 1
fi

docker exec -i "$CONTAINER_NAME" "${COMMAND[@]}"
Make it executable:
chmod +x ~/.local/bin/getscipapers
Now you can run getscipapers directly from your terminal:
getscipapers --help
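If you prefer Python to Bash for the wrapper, the same logic can be sketched as follows. This mirrors the shell script above rather than anything shipped with getscipapers. The function names are hypothetical, and the container name is assumed to be getscipapers-container as in the docker run command earlier.

```python
import shutil
import subprocess
import sys
from typing import List, Sequence

CONTAINER_NAME = "getscipapers-container"


def build_exec_command(args: Sequence[str], container: str = CONTAINER_NAME) -> List[str]:
    """Build the docker exec argument list that forwards args to getscipapers."""
    return ["docker", "exec", "-i", container, "getscipapers", *args]


def container_is_running(container: str = CONTAINER_NAME) -> bool:
    """Return True if docker ps lists a running container matching the name."""
    result = subprocess.run(
        ["docker", "ps", "-q", "-f", f"name={container}"],
        capture_output=True, text=True, check=False,
    )
    return bool(result.stdout.strip())


def main(argv: Sequence[str]) -> int:
    """Run the same checks as the shell wrapper, then forward the arguments."""
    if not argv:
        print("Usage: getscipapers [arguments...]", file=sys.stderr)
        return 1
    if shutil.which("docker") is None:
        print("Error: Docker is not installed", file=sys.stderr)
        return 1
    if not container_is_running():
        print(f"Error: Container '{CONTAINER_NAME}' is not running", file=sys.stderr)
        return 1
    return subprocess.run(build_exec_command(argv)).returncode
```

To use it as a script, call sys.exit(main(sys.argv[1:])) at the bottom of the file. Passing the arguments as a list avoids shell quoting problems with titles or DOIs that contain spaces or special characters.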
For more information, see the official documentation or repository.
You can run getscipapers locally using Docker without installing Python or dependencies on your system.
docker pull ghcr.io/hoanganhduc/getscipapers:latest
docker run --rm -it -v /path/to/local/dir:/data ghcr.io/hoanganhduc/getscipapers:latest --output /data
Replace /path/to/local/dir with your preferred local directory.
This setup allows you to use getscipapers in an isolated environment, keeping your files accessible on your host machine.
StcGeck is slow and generally best avoided, except in specific scenarios (such as when the Nexus bot is under maintenance). If you do not wish to use StcGeck, do not start the IPFS Desktop App or run ipfs daemon. In this case, the script will return errors, but StcGeck will not be used.
The ablesci, scinet, libgen, wosonhj, and facebook modules depend on Selenium and may break if the target websites change.
The facebook module may work locally but fail in GitHub Codespaces or Docker containers (Docker not yet tested). Logging in from a codespace may trigger Facebook verification due to unfamiliar IP addresses. To resolve this, run the Facebook login for the first time with the --no-headless option and use your browser via noVNC to verify your login. Subsequent logins should work without issues. The noVNC access address will look like https://<your-github-codespace-machine-name>-6080.app.github.dev.
The libgen module may occasionally fail; retrying usually resolves the issue.
The nexus module may not work reliably when using a proxy. Issues such as 307 Temporary Redirect errors may occur, and downloads may fail if the Nexus Search server or Telegram bot is unavailable.
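Because several of these failures are intermittent and a retry usually helps, a small retry helper can save some manual re-running. The sketch below is a generic retry loop with exponential backoff, not a feature of getscipapers. The function name retry and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    delay: float = 2.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, waiting longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: re-raise the last error
            sleep(delay)
            delay *= backoff  # wait longer before the next attempt
    raise AssertionError("unreachable")
```

One way to use it is to wrap a subprocess call, for example retry(lambda: subprocess.run(["getscipapers", "--help"], check=True)), replacing --help with your own arguments. This relies on the assumption that the command exits with a non-zero status when a download fails, which you should verify for the module you use.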