|author||Jan Dittberner <email@example.com>||2020-12-29 12:31:30 +0100|
|committer||Jan Dittberner <firstname.lastname@example.org>||2020-12-29 12:31:30 +0100|
Document review findings
1 files changed, 171 insertions, 1 deletions
diff --git a/source/general.rst b/source/general.rst
index 908a017..333a2ab 100644
@@ -18,9 +18,179 @@ implemented in PHP.
+The code is structured into several directories. PHP namespaces are not used
+because the code predates that feature.
+There are a few entry point scripts in the :doc:`www directory <DIR-www>`. These
+include scripts from the :doc:`includes directory <DIR-includes>` and the
+:doc:`pages directory <DIR-pages>`.
+Variable and file naming
+Unfortunately the scripts included from the entry point scripts are named with
+numbers instead of real file names which makes it hard to understand the request
+flow. Variable naming is inconsistent. There are some variables that hide the
+meaning and require a lot of code reading / tracing to reveal their purpose.
Comments and inline documentation
The code base is not documented in a good way, there are neither class nor
method or function comments. Comments are just used for the license header
-in most of the files. \ No newline at end of file
+in most of the files.
+The main web application uses a URL structure with a few entry point PHP scripts
+that include other PHP scripts. URL parameters trigger the inclusion of PHP
+fragments from other files.
+The technical side of the web application (i.e. the web server configuration) is
+sufficient but not state of the art. HSTS is used but other configurations like
+the TLS protocol support is lacking.
+There is no CSRF protection implemented in the application which allows cross
+site request forgery attacks against the users.
+The application uses string concatenation to build SQL statements which makes
+it very likely that the code can be abused for SQL injection attacks. Instead
+of using prepared statements there are a lot of attempts to sanitize input. This
+is obviously not sustainable.
+The HTML code tries to be HTML 4.01 compatible but the code structure makes it
+hard to verify. There are formatting instructions sprinkled in the code and
+leads to a lack of proper content / formatting separation. The code uses only
+The code uses some obsolete constructs like ActiveX controls and the keygen tag
+which need to be replaced.
+Signer / CommModule
+The signer is a custom set of software components implemented in Perl. The
+implementation is brittle and cannot properly cope with broken input. All
+input that is not understood by the application leads to crashes. The
+countermeasure is the ``commdaemon`` script that takes care of restarting crashed
+The signer ``server.pl`` and the signer client ``client.pl`` use external tools
+(openssl, gpg, xdelta) and rely heavily on a stable set of parameters. This
+poses risks to implement operating system package updates.
+The :doc:`signer protocol <signer>` is a mix of binary and text protocol. The
+protocol includes a rudimentary 8 bit CRC code which is not properly checked
+in responses from the signer server.
+An attempt to make the signer code more understandable is implemented in a
+`feature branch <https://github.com/jandd/cacert-devel/tree/signer_rework_for_devsetup>`_
+of Jan Dittberner's clone of the cacert-devel repository.
+Handling of certificate data
+Certificate data (especially Subject DN, SubjectAlternativeName extensions) are
+handled as strings which is not portable and may lead to corruption of data.
+The web application code adds subjectAlternativeName fields to the Subject DN
+of certificate requests for the signer. It would be a better idea to generate
+the proper ASN.1 structures (either a complete TBSCertificate or separate
+Subject DN and subjectAlternativeName extension structures) and pass these to
+Coupling of applications
+There is a close coupling between the data structures of the Web application,
+the signer client and (on test systems) the Test-Manager application. This has
+the consequence that changes in the database structure have to be implemented
+in all applications that use the same database tables or expect the same
+The database structure has been developed before MySQL had proper foreign key
+support. There are some tables that form relationships but theses are not
+formally expressed in the database schema definition.
+The database schema in the release branch (as of 2020-12-29) are incomplete.
+There has been an effort to provide a `more complete schema`_ in a recent
+`pull request <https://github.com/CAcertOrg/cacert-devel/pull/21>`_.
+.. _more complete schema: https://github.com/jandd/cacert-devel/tree/replace_deprecated_mysql_api/scripts/db_migrations
+Character set handling / Internationalization
+The web application, the CAcert automated testing system (CATS) as well as the
+signer use Latin1 / ISO-8859-1 as the only supported character set. This is a
+blocker for more international adoption.
+Translations are using the GNU gettext library. Translation files are pulled
+from translations.cacert.org via a Makefile in the
+:doc:`locale directory <DIR-locale>`. It might be a good idea to store a
+verified state of translations in the repository and update them during the
+Testing is only performed manually. There are no automated tests.
+Deployment and configuration
+The deployment process for the software requires changes to application files.
+An attempt to fix this situation has been made in the pull request for the
+`MySQL compatibility fixes`_ and the `signer rework`_ of Jan Dittberner.
+The application should be configurable via environment variables like a proper
+`12 factor application`_. Jan implemented a `docker-compose`_ based
+`developer setup`_ that implements the configuration in that way.
+.. _MySQL compatibility fixes: https://github.com/CAcertOrg/cacert-devel/pull/21
+.. _signer rework: https://github.com/jandd/cacert-devel/tree/signer_rework_for_devsetup
+.. _12 factor application: https://12factor.net/
+.. _docker-compose: https://docs.docker.com/compose/
+.. _developer setup: https://git.dittberner.info/jan/cacert-devsetup
+From `my <https://jan.dittberner.info>`_ point of view after getting familiar
+with the existing code base I come to the following conclusions: The current
+software would require a complete rewrite that takes the following things into
+- UTF-8 handling for everything
+- proper ASN.1 handling
+- cleaner separation between components
+- modern web application standard with clean separation of content and
+- secure development practices
+- documented and automated deployment
+- continuous integration of changes to avoid long living feature branches
+- signer protocol with better binary support, strong consistency checks and
+- signer support for requesting CA certificates and GPG public keys used for
+ signing to allow fully automated bootstrapping of the signer client and web
+- automated tests for critical functionality
+I suggest to use a more `modern statically typed language`_ with proper support
+avoid reimplementation of essential stuff.
+.. _modern statically typed language: https://golang.org/ \ No newline at end of file