3 changes: 1 addition & 2 deletions docs/index.md
@@ -39,9 +39,8 @@ Various forms of documentation for SystemDS are available.
- [Builtin Functions](./site/builtins-reference) contains a collection of builtin functions providing an high level abstraction on complex machine learning algorithms.
- [Algorithm Reference](./site/algorithms-reference) contains specifics on algorithms supported in systemds.
- [Entity Resolution](./site/entity-resolution) provides a collection of customizable entity resolution primitives and pipelines.
- [Run SystemDS](./site/run) contains an Helloworld example along with an environment setup guide.
- [Quickstart Guide](./site/install) guides through setup from git download to running system.
- Instructions on python can be found at [Python Documentation](./api/python/index)
- The [JavaDOC](./api/java/index) contains internal documentation of the system source code.
- [Install from Source](./site/install) guides through setup from git download to running system.
- If you want to contribute take a look at [Contributing](https://github.com/apache/systemds/blob/main/CONTRIBUTING.md)
- [R to DML](./site/dml-vs-r-guide) walks through the basics of converting a script from R to dml.
211 changes: 31 additions & 180 deletions docs/site/install.md
@@ -1,6 +1,7 @@
---
layout: site
title: SystemDS Install from source
title: SystemDS Quickstart Guide
description: Quickstart guide for installing and running SystemDS on Windows, Linux, and macOS
---
<!--
{% comment %}
@@ -21,202 +22,52 @@ limitations under the License.
{% endcomment %}
-->

# Install SystemDS from Source
Welcome to the quickstart guide for Apache SystemDS. This quickstart page provides a high-level overview of both installation and points you to the detailed documentation for each path.

This guide helps in the install and setup of SystemDS from source code.
SystemDS can be installed and used in two different ways:

---

- [1. Install on Windows](#1-install-on-windows)
- [2. Install on Ubuntu](#2-install-on-ubuntu-2204--2404)
- [3. Install on macOS](#3-install-on-macos)
- [4. Build the Project](#4-build-the-project)
- [5. Run a Component Test](#5-run-a-component-test)
- [6. Next Steps](#6-next-steps)

Once the individual environment is set up, you can continue with the common build steps below.

---

# 1. Install on Windows

First setup Java and maven to compile the system note the Java version is 17, we suggest using Java OpenJDK 17.

- <https://openjdk.org/>
- <https://maven.apache.org/download.cgi?.>

Setup your environment variables with JAVA_HOME and MAVEN_HOME. Using these variables add the JAVA_HOME/bin and MAVEN_HOME/bin to the path environment variable. An example of setting it for Java can be found here: <https://www.thewindowsclub.com/set-java_home-in-windows-10>

To run the system we also have to setup some Hadoop and Spark specific libraries. These can be found in the SystemDS repository. To add this, simply take out the files, or add 'src/test/config/hadoop_bin_windows/bin' to PATH. Just like for JAVA_HOME set a HADOOP_HOME to the environment variable without the bin part, and add the `%HADOOP_HOME%/bin` to path.

On Windows, cloning large repositories via GitHub Desktop may stall in some environments. If this happens, cloning via the Git command line is a reliable alternative.
Example:
```bash
git clone https://github.com/apache/systemds.git
```

To make the build go faster set the IDE or environment variables for Java: '-Xmx16g -Xms16g -Xmn1600m'. Here set the memory to something close to max memory of the device you are using.

To start editing the files remember to import the code style formatting into the IDE, to keep the changes of the files consistent.

A suggested starting point would be to run some of the component tests from your IDE.

# 2. Install on Ubuntu (22.04 / 24.04)

### 2.1 Install Java 17 and Maven

First setup Java, maven and git to compile the system note that the Java version is 17.

```bash
sudo apt update
sudo apt install openjdk-17-jdk maven
sudo apt install -y git
```

Verify the install with:
```bash
java -version
mvn -version
git --version
```

This should return something like:
```bash
openjdk 17.x.x
Apache Maven 3.x.x
git version 2.x.x
```
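The version check above could also be scripted. The following is a minimal sketch, not part of SystemDS: the helper names `java_major` and `is_java_17` are hypothetical, and it assumes the first line of `java -version` has the usual `version "17.x.x"` shape.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of SystemDS: extract the major Java version
# from a line shaped like:  openjdk version "17.0.9" 2023-10-17
java_major() {
  printf '%s\n' "$1" | sed -E 's/.*version "([0-9]+)[^"]*".*/\1/'
}

# Succeeds only when the given version line reports major version 17.
is_java_17() {
  [ "$(java_major "$1")" = "17" ]
}
```

A possible use is `is_java_17 "$(java -version 2>&1 | head -n 1)" || echo "Java 17 expected"`; the redirect is needed because `java -version` writes to stderr.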
1. Using a **downloaded release**
2. Using a **source build**

### 2.2 Set JAVA_HOME for Javadocs
If you are primarily a user of SystemDS, start with the Release installation. If you plan to contribute or modify internals, follow the Source installation.

Set `JAVA_HOME` (required for generating Javadocs during the Maven build):
```bash
export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which javac))))
export PATH="$JAVA_HOME/bin:$PATH"
```
Each method is demonstrated in:
- Local mode
- Spark mode
- Federated mode (simple example)

### 2.3 Clone Source Code
For detailed configuration topics (BLAS, GPU, federated setup, contributing), see the links at the end.

Clone the source code:
```bash
cd /opt
git clone https://github.com/apache/systemds.git
cd systemds
```

### 2.4 Testing

R should be installed to run the test suite, since many tests are constructed to compare output with common R packages. One option to install this is to follow the guide on the following link: <https://linuxize.com/post/how-to-install-r-on-ubuntu-20-04/>

R can be installed using the CRAN repository.

**Ubuntu 22.04**

```bash
sudo apt install dirmngr gnupg apt-transport-https ca-certificates software-properties-common
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/'
sudo apt update
sudo apt install r-base
```

**Ubuntu 24.04**

```bash
sudo apt install dirmngr gnupg apt-transport-https ca-certificates software-properties-common
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu noble-cran40/'
sudo apt update
sudo apt install r-base
```

Verify the installation:
```bash
R --version
```

**Install R Dependencies for Integration Tests (Optional)** If you want to run integration tests that depend on additional R packages, install them via:
```bash
Rscript ./src/test/scripts/installDependencies.R
```

# 3. Install on MacOS

Prerequisite install homebrew on the device.

```bash
# To allow relative paths:
brew install coreutils
# To install open jdk 17.
brew install openjdk@17
# Install maven to enable compilation of SystemDS.
brew install maven
```

Then afterwards verify the install:

```bash
java --version
mvn --version
```

This should print Java version.

Note that if you have multiple __java__ versions installed then you have to change the used version to 17, on __both java and javadoc__. This is done by setting the environment variable JAVA_HOME to the install path of open JDK 17 :

```bash
export JAVA_HOME=`/usr/libexec/java_home -v 17`
```

For running all tests [r-base](https://cran.r-project.org/bin/macosx/) has to be installed as well since this is used as a secondary system to verify the correctness of our code, but it is not a requirement to enable building the project.

Optionally, you need to install the R dependencies for integration tests, like this:
(use `sudo` mode if the script couldn't write to local R library)
---

```bash
Rscript ./src/test/scripts/installDependencies.R
```
# 1. Install from a Release

# 4. Build the project
If you simply want to *use* SystemDS without modifying the source code, the recommended approach is to install SystemDS from an official Apache release.

To compile the project use in the directory of the source code:
```bash
mvn package -P distribution
```
**Full Release Installation Guide:** [Install SystemDS from a Release](release_install)

Example output:
```bash
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 31.730 s
[INFO] Finished at: 2020-06-18T11:00:29+02:00
[INFO] ------------------------------------------------------------------------
```
# 2. Install from Source

The first time you package the system it will take longer since maven will download the dependencies. But successive compiles should become faster. The runnable JAR files will appear in `target/`.
If you plan to contribute to SystemDS or need to modify its internals, you can build SystemDS from source.

### (Optional) Add SystemDS CLI to PATH
**Full Source Build Guide:** [Install SystemDS from Source](source_install)

After building SystemDS from source, you can add the `bin` directory to your
`PATH` in order to run `systemds` directly from the command line:
# 3. After Installation

```bash
export SYSTEMDS_ROOT=$(pwd)
export PATH="$SYSTEMDS_ROOT/bin:$PATH"
```
This allows you to run `systemds` from the repository root. For running the freshly built executable JAR (e.g., `target/SystemDS.jar`) on Spark, see the Spark section in [Execute SystemDS](run).
Once either installation path is completed, you can start running scripts:

# 5. Run A Component Test
- Local Mode - Run SystemDS locally
- Spark Mode - Execute scripts on Spark through `spark-submit`
- Federated Mode - Run operations on remote data using federated workers

As an example here is how to run the component matrix tests from command line via maven.
For detailed commands and examples: [Execute SystemDS](run)

```bash
mvn test -Dtest="**.component.matrix.**"
```
# 4. More Configuration

To run other tests simply specify other packages by modifying the test argument part of the command.
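
As a rough illustration of modifying the test argument, the sketch below builds the `-Dtest` pattern from a package name. The helper `component_test_pattern` is hypothetical (not part of SystemDS), and only the `matrix` package is confirmed by the command above; any other package name is an assumption to verify against the test sources.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of SystemDS: build the Maven -Dtest pattern
# for one component test package, following the form of the command above.
component_test_pattern() {
  printf '**.component.%s.**' "$1"
}

# Example (not executed here):
#   mvn test -Dtest="$(component_test_pattern matrix)"
```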
SystemDS provides advanced configuration options for performance tuning and specialized execution environments.

# 6. Next Steps
- GPU Support — [GPU Guide](https://apache.github.io/systemds/site/gpu)
- BLAS / Native Acceleration — [Native Backend (BLAS) Guide](native-backend)
- Federated Backend Deployment — [Federated Guide](federated-monitoring)
- Contributing to SystemDS — [Contributing Guide](https://github.com/apache/systemds/blob/main/CONTRIBUTING.md)

Now everything is setup and ready to go! For running scripts in Spark mode or experimenting with federated workers, see the Execution Guide: [Execute SystemDS](run)
75 changes: 0 additions & 75 deletions docs/site/quickstart_extended.md

This file was deleted.

4 changes: 1 addition & 3 deletions docs/site/release_install.md
@@ -22,8 +22,6 @@ limitations under the License.
{% endcomment %}
-->

# Install SystemDS from a Release

This guide explains how to install and set up SystemDS using the pre-built release archives.

---
@@ -164,7 +162,7 @@ Expected output:
Hello World!
```

# 4. Install on macOS
# 4. Install on MacOS

### 4.1 Extract the Release
