summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJan Dittberner <jandd@cacert.org>2020-12-29 12:31:30 +0100
committerJan Dittberner <jandd@cacert.org>2020-12-29 12:31:30 +0100
commitddfc693f1611c3e6303dd8068228e4c9905e64a5 (patch)
tree8f32494e8cf458060b5a310951b6e621f56e5bb8
parent2dda5a9f0daed4a0e322aa8d01efe45c1962e3cc (diff)
downloadcacert-codedocs-ddfc693f1611c3e6303dd8068228e4c9905e64a5.tar.gz
cacert-codedocs-ddfc693f1611c3e6303dd8068228e4c9905e64a5.tar.xz
cacert-codedocs-ddfc693f1611c3e6303dd8068228e4c9905e64a5.zip
Document review findings
-rw-r--r--source/general.rst172
1 files changed, 171 insertions, 1 deletions
diff --git a/source/general.rst b/source/general.rst
index 908a017..333a2ab 100644
--- a/source/general.rst
+++ b/source/general.rst
@@ -18,9 +18,179 @@ implemented in PHP.
Code structure
==============
+The code is structured into several directories. PHP namespaces are not used
+because the code predates that feature.
+
+There are a few entry point scripts in the :doc:`www directory <DIR-www>`. These
+include scripts from the :doc:`includes directory <DIR-includes>` and the
+:doc:`pages directory <DIR-pages>`.
+
+Variable and file naming
+========================
+
+Unfortunately the scripts included from the entry point scripts are named with
+numbers instead of real file names which makes it hard to understand the request
+flow. Variable naming is inconsistent. There are some variables that hide the
+meaning and require a lot of code reading / tracing to reveal their purpose.
+
Comments and inline documentation
=================================
The code base is not documented in a good way, there are neither class nor
method or function comments. Comments are just used for the license header
-in most of the files. \ No newline at end of file
+in most of the files.
+
+Web application
+===============
+
+URL structure
+-------------
+
+The main web application uses a URL structure with a few entry point PHP scripts
+that include other PHP scripts. URL parameters trigger the inclusion of PHP
+fragments from other files.
+
+Security functionality
+----------------------
+
+The technical side of the web application (i.e. the web server configuration) is
+sufficient but not state of the art. HSTS is used but other configurations like
+the TLS protocol support is lacking.
+
+There is no CSRF protection implemented in the application which allows cross
+site request forgery attacks against the users.
+
+The application uses string concatenation to build SQL statements which makes
+it very likely that the code can be abused for SQL injection attacks. Instead
+of using prepared statements there are a lot of attempts to sanitize input. This
+is obviously not sustainable.
+
+HTML / CSS / Javascript
+-----------------------
+
+The HTML code tries to be HTML 4.01 compatible but the code structure makes it
+hard to verify. There are formatting instructions sprinkled in the code and
+leads to a lack of proper content / formatting separation. The code uses only
+a small amount of JavaScript.
+
+The code uses some obsolete constructs like ActiveX controls and the keygen tag
+which need to be replaced.
+
+Signer / CommModule
+===================
+
+The signer is a custom set of software components implemented in Perl. The
+implementation is brittle and cannot properly cope with broken input. All
+input that is not understood by the application leads to crashes. The
+countermeasure is the ``commdaemon`` script that takes care of restarting crashed
+signer components.
+
+The signer ``server.pl`` and the signer client ``client.pl`` use external tools
+(openssl, gpg, xdelta) and rely heavily on a stable set of parameters. This
+poses risks to implement operating system package updates.
+
+The :doc:`signer protocol <signer>` is a mix of binary and text protocol. The
+protocol includes a rudimentary 8 bit CRC code which is not properly checked
+in responses from the signer server.
+
+An attempt to make the signer code more understandable is implemented in a
+`feature branch <https://github.com/jandd/cacert-devel/tree/signer_rework_for_devsetup>`_
+of Jan Dittberner's clone of the cacert-devel repository.
+
+Handling of certificate data
+============================
+
+Certificate data (especially Subject DN, SubjectAlternativeName extensions) are
+handled as strings which is not portable and may lead to corruption of data.
+
+The web application code adds subjectAlternativeName fields to the Subject DN
+of certificate requests for the signer. It would be a better idea to generate
+the proper ASN.1 structures (either a complete TBSCertificate or separate
+Subject DN and subjectAlternativeName extension structures) and pass these to
+the signer.
+
+Coupling of applications
+========================
+
+There is a close coupling between the data structures of the Web application,
+the signer client and (on test systems) the Test-Manager application. This has
+the consequence that changes in the database structure have to be implemented
+in all applications that use the same database tables or expect the same
+consistency guarantees.
+
+Database structure
+==================
+
+The database structure has been developed before MySQL had proper foreign key
+support. There are some tables that form relationships but theses are not
+formally expressed in the database schema definition.
+
+The database schema in the release branch (as of 2020-12-29) are incomplete.
+There has been an effort to provide a `more complete schema`_ in a recent
+`pull request <https://github.com/CAcertOrg/cacert-devel/pull/21>`_.
+
+.. _more complete schema: https://github.com/jandd/cacert-devel/tree/replace_deprecated_mysql_api/scripts/db_migrations
+
+Character set handling / Internationalization
+=============================================
+
+The web application, the CAcert automated testing system (CATS) as well as the
+signer use Latin1 / ISO-8859-1 as the only supported character set. This is a
+blocker for more international adoption.
+
+Translations are using the GNU gettext library. Translation files are pulled
+from translations.cacert.org via a Makefile in the
+:doc:`locale directory <DIR-locale>`. It might be a good idea to store a
+verified state of translations in the repository and update them during the
+release process.
+
+Testing
+=======
+
+Testing is only performed manually. There are no automated tests.
+
+Deployment and configuration
+============================
+
+The deployment process for the software requires changes to application files.
+An attempt to fix this situation has been made in the pull request for the
+`MySQL compatibility fixes`_ and the `signer rework`_ of Jan Dittberner.
+
+The application should be configurable via environment variables like a proper
+`12 factor application`_. Jan implemented a `docker-compose`_ based
+`developer setup`_ that implements the configuration in that way.
+
+.. _MySQL compatibility fixes: https://github.com/CAcertOrg/cacert-devel/pull/21
+.. _signer rework: https://github.com/jandd/cacert-devel/tree/signer_rework_for_devsetup
+.. _12 factor application: https://12factor.net/
+.. _docker-compose: https://docs.docker.com/compose/
+.. _developer setup: https://git.dittberner.info/jan/cacert-devsetup
+
+Conclusions
+===========
+
+From `my <https://jan.dittberner.info>`_ point of view after getting familiar
+with the existing code base I come to the following conclusions: The current
+software would require a complete rewrite that takes the following things into
+consideration:
+
+- UTF-8 handling for everything
+- proper ASN.1 handling
+- cleaner separation between components
+- modern web application standard with clean separation of content and
+ presentation
+- secure development practices
+- documented and automated deployment
+- continuous integration of changes to avoid long living feature branches
+- signer protocol with better binary support, strong consistency checks and
+ testability
+- signer support for requesting CA certificates and GPG public keys used for
+ signing to allow fully automated bootstrapping of the signer client and web
+ application
+- automated tests for critical functionality
+
+I suggest to use a more `modern statically typed language`_ with proper support
+for PKI idioms and some CSS, and client side (ECMA/JavaScript) frameworks to
+avoid reimplementation of essential stuff.
+
+.. _modern statically typed language: https://golang.org/ \ No newline at end of file