hedgedoc/docs/content/dev/2.0.md

# 2.0 Development Notes
This document collects notes and decisions taken during the development of HedgeDoc 2.0.
It should be converted to a properly structured documentation, but having unstructured docs
is better than having no docs.

## Supported databases
We intend to officially support and test these databases:
- SQLite (for development and smaller instances)
- PostgreSQL
- MariaDB

## Special Groups
The software provides two special groups which have no explicit users:
- `everyone` (Describing that everyone who wants to access a note can do if it is enabled in the config.)
- `loggedIn` (Describing all users which are logged in)

## Deleting notes and revisions
- The owner of a note may delete it.
    - By default, this also removes all revisions and all files that were uploaded to that note.
    - The owner may choose to skip deleting associated uploads, leaving them without a note.
    - The frontend should show a list of all uploads that will be affected
      and provide a method of skipping deletion. 
- The owner of a note may delete all revisions. This effectively purges the edit
  history of a note.

## Entity `create` methods

Because we need to have empty constructors in our entity classes for TypeORM to work, the actual constructor is a separate `create` method. These methods should adhere to these guidelines:

- Only require the non-optional properties of the corresponding entity
- Have no optional parameters
- Have no lists which can be empty (so probably most of them)
- Should either return a complete and fully useable instance or return a Pick/Omit type.
- Exceptions to these rules are allowed, if they are mentioned in the method documentation

## Auth tokens for the public API
The public API uses bearer tokens for authentication.

When a new token is requested via the private API, the backend generates a 64 bytes-long secret of
cryptographically secure data and returns it as a base64url-encoded string, along with an identifier.
That string can then be used by clients as a bearer token.

A SHA-512 hash of the secret is stored in the database. To validate tokens, the backend computes the hash of the provided
secret and checks it against the stored hash for the provided identifier.

### Choosing a hash function
Unfortunately, there does not seem to be any explicit documentation about our exact use-case.
Most docs describe classic password-saving scenarios and recommend bcrypt, scrypt or argon2.
These hashing functions are slow to stop brute-force or dictionary attacks, which would expose the original,
user-provided password, that may have been reused across multiple services.

We have a very different scenario:
Our API tokens are 64 bytes of cryptographically strong pseudorandom data.
Brute-force or dictionary attacks are therefore virtually impossible, and tokens are not reused across multiple services.
We therefore need to only guard against one scenario:
An attacker gains read-only access to the database. Saving only hashes in the database prevents the attacker
from authenticating themselves as a user. The hash-function does not need to be very slow, 
as the randomness of the original token prevents inverting the hash. The function actually needs to be reasonably fast,
as the hash must be computed on every request to the public API.
SHA-512 (or alternatively SHA3) fits this use-case.
Add intention of 2.0 development docs Signed-off-by: David Mehren <git@herrmehren.de> 2021-02-19 06:43:55 -05:00			`# 2.0 Development Notes`
			`This document collects notes and decisions taken during the development of HedgeDoc 2.0.`
			`It should be converted to a properly structured documentation, but having unstructured docs`
			`is better than having no docs.`
Log warnings when using hardcoded data. Signed-off-by: David Mehren <git@herrmehren.de> 2020-07-26 11:28:32 -04:00
Document supported databases in 2.0 Closes #447 Signed-off-by: David Mehren <git@herrmehren.de> 2021-02-19 06:50:16 -05:00			`## Supported databases`
			`We intend to officially support and test these databases:`
			`- SQLite (for development and smaller instances)`
			`- PostgreSQL`
			`- MariaDB`

Docs: Add description for special groups Signed-off-by: Yannick Bungers <git@innay.de> 2021-02-27 08:16:29 -05:00			`## Special Groups`
			`The software provides two special groups which have no explicit users:`
			- `everyone` (Describing that everyone who wants to access a note can do if it is enabled in the config.)
			- `loggedIn` (Describing all users which are logged in)
Document supported databases in 2.0 Closes #447 Signed-off-by: David Mehren <git@herrmehren.de> 2021-02-19 06:50:16 -05:00
Add dev docs about deleting notes & revisions Fixes #560 Signed-off-by: David Mehren <git@herrmehren.de> 2021-03-15 06:53:44 -04:00			`## Deleting notes and revisions`
			`- The owner of a note may delete it.`
			`- By default, this also removes all revisions and all files that were uploaded to that note.`
			`- The owner may choose to skip deleting associated uploads, leaving them without a note.`
			`- The frontend should show a list of all uploads that will be affected`
			`and provide a method of skipping deletion.`
			`- The owner of a note may delete all revisions. This effectively purges the edit`
			`history of a note.`

docs: add dev documentation for create methods Signed-off-by: Philip Molares <philip.molares@udo.edu> 2021-11-11 16:07:00 -05:00			## Entity `create` methods

			Because we need to have empty constructors in our entity classes for TypeORM to work, the actual constructor is a separate `create` method. These methods should adhere to these guidelines:

			`- Only require the non-optional properties of the corresponding entity`
			`- Have no optional parameters`
			`- Have no lists which can be empty (so probably most of them)`
			`- Should either return a complete and fully useable instance or return a Pick/Omit type.`
			`- Exceptions to these rules are allowed, if they are mentioned in the method documentation`

docs: explain the choice of sha-512 for auth tokens Signed-off-by: David Mehren <git@herrmehren.de> 2021-12-14 13:21:28 -05:00			`## Auth tokens for the public API`
			`The public API uses bearer tokens for authentication.`

			`When a new token is requested via the private API, the backend generates a 64 bytes-long secret of`
			`cryptographically secure data and returns it as a base64url-encoded string, along with an identifier.`
			`That string can then be used by clients as a bearer token.`

			`A SHA-512 hash of the secret is stored in the database. To validate tokens, the backend computes the hash of the provided`
			`secret and checks it against the stored hash for the provided identifier.`

			`### Choosing a hash function`
			`Unfortunately, there does not seem to be any explicit documentation about our exact use-case.`
			`Most docs describe classic password-saving scenarios and recommend bcrypt, scrypt or argon2.`
			`These hashing functions are slow to stop brute-force or dictionary attacks, which would expose the original,`
			`user-provided password, that may have been reused across multiple services.`

			`We have a very different scenario:`
			`Our API tokens are 64 bytes of cryptographically strong pseudorandom data.`
			`Brute-force or dictionary attacks are therefore virtually impossible, and tokens are not reused across multiple services.`
			`We therefore need to only guard against one scenario:`
			`An attacker gains read-only access to the database. Saving only hashes in the database prevents the attacker`
			`from authenticating themselves as a user. The hash-function does not need to be very slow,`
			`as the randomness of the original token prevents inverting the hash. The function actually needs to be reasonably fast,`
			`as the hash must be computed on every request to the public API.`
			`SHA-512 (or alternatively SHA3) fits this use-case.`