User-Guide
==========

The Webinput-CSV can be started with default values and is fully usable, except for injecting data into the IntelMQ pipeline queue.
Most input parameters are available in the frontend and are self-explanatory.

Configuration
-------------

There are three levels of parameters:

- internal defaults
- the configuration file
- parameters set explicitly by the web interface

The only parameter you really need to set is ``destination_pipeline_queue``, which is required to submit data to IntelMQ.
It has no internal default.

Usual configuration parameters
------------------------------

- ``prefix``: can only be a path. Needed if the program does not run at ``/`` but in a sub-path.
- ``intelmq``: the parameters in this array are used to set up the IntelMQ pipeline. ``destination_pipeline_*`` inside the array can be used to configure the pipeline; see the IntelMQ user guide for details.
- ``destination_pipeline_queue``: The queue/routing key to push the messages into if the user does not check *Validate/Submit with Bots*.
- ``destination_pipeline_queue_formatted``: Optional. If true, ``destination_pipeline_queue`` is formatted like a Python format string with the event as ``ev``, e.g. ``"destination_pipeline_queue": "{ev[feed.provider]}.{ev[feed.name]}"``
- ``custom_input_fields``: These fields are shown in the interface with the given default values, see also below.
- ``constant_fields``: Similar to ``custom_input_fields``, but not shown to the user and added to all processed events, overwriting user input.
- ``required_fields``: A list of IntelMQ field names. If set (not an empty list), all lines need to have these fields, otherwise they are marked as invalid and not submitted to IntelMQ. Example: ``required_fields: ["time.source", "source.ip"]``
- ``bots``: An optional dictionary of bot definitions with the same structure as IntelMQ's runtime configuration.
  If the Webinput user checks *Validate/Submit with Bots* in the frontend, these bots are used for data validation and submission.
  The definition needs to contain a processing chain of any number of experts and exactly one output bot.
  For validation, output bots are not called.
  For submission, only the output bot writes the data to its destination; the data is not additionally pushed to the Redis queue specified in the configuration parameter ``destination_pipeline_queue``.

  Example configuration:

  .. code:: json

     "bots": {
         "taxonomy": {
             "module": "intelmq.bots.experts.taxonomy.expert"
         },
         "url": {
             "module": "intelmq.bots.experts.url.expert"
         },
         "gethostbyname": {
             "module": "intelmq.bots.experts.gethostbyname.expert"
         },
         "sql": {
             "module": "intelmq_webinput_csv.sql_output",
             "parameters": {
                 "autocommit": false,
                 "engine": "postgresql",
                 "database": "eventdb",
                 "host": "intelmq-database",
                 "port": 5432,
                 "user": "intelmq",
                 "password": "secret",
                 "sslmode": "allow",
                 "table": "events"
             }
         }
     }

  Warning: If no SQL output bot is configured and the user uses validation/submission with bots, no data will be written anywhere, neither to an IntelMQ pipeline nor to the database!

- ``allowed_event_fields``: A list of `IntelMQ Event Field Names `_ which are allowed for user input. If left empty, all fields are allowed (default). The check is applied in the frontend (field selection for columns) and in the backend. The check does **not** apply to constant fields and custom input fields!
- ``custom_workflow_default``: If true (default: false), the switch "Use custom workflow" in the "Data validation and submission" section is enabled by default.
- ``allow_validation_override``: If true (default), the user is allowed to submit data even if the previous validation run did not succeed (not all lines/events were valid).

Mailgen configuration parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ``mailgen_config_file``: Optional path to the mailgen configuration file.
- ``target_groups``: Configuration of how the backend queries the available target groups. See below.
- ``mailgen_multi_templates_enabled``: Enables the multi-template editor. It shows all available templates, validates them during editing, and offers a preview for each, with buttons to reset a template to its original state and to drop it from the current mailgen run. Additionally, it allows the user to save templates to disk and delete templates from disk, resulting in a complete template editor for mailgen. If disabled, a single template input is shown in the UI, which is fixed for all notifications.
- ``mailgen_default_template``: Name of the template selected by default in the web interface.

Usage
-----

Empty lines are always ignored.

Parameters
~~~~~~~~~~

Upload
^^^^^^

- delimiter
- quotechar
- escapechar
- skip initial space: ignore whitespace after the delimiter
- has header: If checked, the first line of the file is shown in the preview, but is not used for submission.
- skip initial N lines: number of lines (*after* the header) which are ignored for preview and submission.

Preview
^^^^^^^

- timezone: The timezone is only added if no timezone is detected in the existing value. Used for both ``time.source`` and ``time.observation``.
- dry run: sets classification type and identifier to ``test``

Custom Input fields
'''''''''''''''''''

The custom input fields are added to all individual events if not already present.

- classification type and identifier: default values to be added to rows which do not already have these values

Additional fields with default values are configurable.

.. _upload-1:

Upload
~~~~~~

To submit the data to IntelMQ, click *Send*.
All lines that did not fail are submitted.
After submission, the total number of submitted lines is shown.

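The upload parameters listed under *Parameters* carry the same names as the options of Python's ``csv`` module. The following is a minimal sketch of how they could interact, assuming the input is parsed with ``csv.reader``; this guide does not confirm that, and the function ``read_rows`` and its keyword names are hypothetical and not part of Webinput.

```python
import csv
import io


def read_rows(text, delimiter=",", quotechar='"', escapechar=None,
              skipinitialspace=False, has_header=True, skip_initial_lines=0):
    """Hypothetical helper: parse CSV text using the upload parameters."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quotechar=quotechar,
        escapechar=escapechar,
        skipinitialspace=skipinitialspace,
    )
    # Empty lines are always ignored (csv.reader yields [] for them).
    rows = [row for row in reader if row]
    # The header is only useful for the preview, never for submission.
    header = rows.pop(0) if has_header and rows else None
    # Skip N lines *after* the header.
    return header, rows[skip_initial_lines:]


example = "ip; time\n\n# comment line\n192.0.2.1; 2024-01-01T00:00:00\n"
header, rows = read_rows(example, delimiter=";", skipinitialspace=True,
                         skip_initial_lines=1)
```

Here the blank line is dropped, the first line is kept as header only, and the comment line is removed by skipping one initial line, so a single data row remains.
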
Integration with Mailgen
------------------------

In IntelMQ setups that use `IntelMQ Mailgen `__ to create and deliver notifications to network owners, some additional tweaks add more value and flexibility to the system.

A few things need to be considered for the setup and configuration:

1. The database user used by Mailgen via Webinput needs permissions on the events table and its ID sequence:

   .. code:: sql

      GRANT INSERT ON TABLE events TO intelmq_mailgen;
      -- sequences accept USAGE, SELECT and UPDATE, not INSERT
      GRANT USAGE ON SEQUENCE events_id_seq TO intelmq_mailgen;

2. For OpenPGP signatures in Mailgen, the webserver user (or the user running the WSGI process) must have sufficient privileges on the GnuPG home directory:

   1. write access on the directory itself to create temporary files
   2. read access to all files in the directory

Applying different bots on one-shot data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In a typical IntelMQ setup, all collectors and parsers feed the data into a consecutive queue of expert bots and finally into one or more output bots.
Running different bots (or the same bots with other parameters) may be necessary for one-shot data.
The parameter ``destination_pipeline_queue`` defines where IntelMQ Webinput injects the data into the IntelMQ pipeline.

Further, setting a unique attribute in the events themselves (typically in the ``extra`` or ``feed`` section) allows applying “switches” (like rail switches) in the IntelMQ pipeline, routing the one-shot data to different bots.
The configuration parameters ``constant_fields`` and ``custom_input_fields`` are ideal for achieving this.
For example:

.. code:: json

   "constant_fields": {
       "feed.provider": "my-organization"
   }

If a CERT-Bund Rules expert may receive data from IntelMQ Webinput but should ignore it, a rule similar to this example can be used:

.. code:: python

   from intelmq_certbund_contact.rulesupport import Directive


   def determine_directives(context):
       # This rule only handles the source section.
       if context.section == "destination":
           return
       feed = context.get("feed.name")
       # feed.name may be missing, so guard against None.
       if feed and feed.startswith('oneshot-csv'):
           context.logger.info('Oneshot detected!')
           # Returning True stops further rule processing for this event.
           return True
       return

In this example, a ``feed.name`` starting with ``oneshot-csv`` is the ignore criterion.

Using a differing IntelMQ Mailgen
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Normally, the data from the regular IntelMQ pipeline and the one-shot data end up in the same database and are mixed again.
To send the notifications, IntelMQ Mailgen must filter by the same criteria when querying the database.

You can use two different Mailgen instances: a “normal” one and one for the one-shot data.
Two features are useful for this:

1. By default, intelmqcbmail loads and uses ``/etc/intelmq/intelmq-mailgen.conf``.
   The command line parameter ``--config`` / ``-c`` makes it read an alternative configuration file instead.
   For example:

   .. code::

      > intelmqcbmail -c /etc/intelmq/intelmq-mailgen-oneshot.conf

   For more details, see https://github.com/Intevation/intelmq-mailgen#user-content-configuration

   IntelMQ Webinput can select a different configuration file for ``intelmqmail`` using the ``mailgen_config_file`` configuration parameter.
   For example:

   .. code:: json

      "mailgen_config_file": "/etc/intelmq/intelmq-mailgen-oneshot.conf"

2. The Mailgen configuration parameter ``additional_directive_where`` adds conditions to the WHERE clause of the SQL statement for the directives:

   .. code:: json

      "additional_directive_where": "\"template_name\" = 'qakbot_provider'"

   It is also possible to filter by the event’s attributes.
   For this purpose, the events table is joined automatically.

   .. code:: json

      "additional_directive_where": "events.\"feed.provider\" = 'my-organization'"

   Filtering by event data decreases performance; it is recommended to filter on the directives only whenever possible.

   Documentation: https://github.com/Intevation/intelmq-mailgen#user-content-database-1

Using different scripts (formats)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Mailgen configuration specific to Webinput can contain a different path to other Mailgen scripts, for example:

.. code:: json

   "script_directory": "/opt/formats/oneshot"

In contrast to normal Mailgen operation, Webinput passes the assigned columns of the input to the script as the default table format.
In earlier versions of Mailgen, the table format was a mandatory parameter of ``context.mail_format_as_csv``; it defines which data columns the CSV attachment of the e-mail notifications contains.
If the script does not itself pass a table format to ``mail_format_as_csv``, Mailgen uses the columns which the user assigned in the Webinput user interface.
Thus, the simplest Mailgen script is:

.. code:: python

   def create_notifications(context):
       # always create notifications, never postpone
       return context.mail_format_as_csv(substitutions={})

Defining CSV attachment columns
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The table format (also: format spec) defines which data fields of the entire event data are included in the CSV attachment file of the notifications.
Mailgen's behavior is described in `its documentation `_.

Webinput passes the names of the columns assigned by the operator to Mailgen.
If the Mailgen scripts do not define any other format spec, the notifications contain exactly the columns assigned by the operator.
If the Mailgen scripts do define a format spec, it takes precedence.

Mailgen Templates
~~~~~~~~~~~~~~~~~

The CERT-Bund Rules expert bases its decision on which template to use solely on the event itself.
Additional information can be added by the Webinput operator.

With system-defined templates
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The templates are already configured on the server by the system administrator, and the Webinput operator influences which template Mailgen uses.
Add a new input field to the Webinput configuration like this:

.. code:: json

   "custom_input_fields": {
       "extra.template_prefix": ""
   }

A rule of the CERT-Bund Contact rules expert may look like this:

.. code:: python

   def determine_directives(context):
       ...
       template = context.get("extra.template_prefix", "oneshot_fallback")
       # Remove the field
       context.pop("extra.template_prefix", None)
       add_directives_to_context(context, msm, template)
       return True

   ...

   def create_directive(notification_format, matter, target_group, interval, data_format):
       """
       This method creates Directives looking like:
       template_name: openportmapper_provider
       notification_format: vulnerable-service
       notification_interval: 86400
       data_format: openportmapper_csv_inline
       """
       return Directive(template_name=matter + "_" + target_group,
                        notification_format=notification_format,
                        event_data_format=data_format,
                        notification_interval=interval)

In this example, the template will be ``$event[extra.template_prefix]_$target_group``.
More complex rules are possible, of course.
Keep in mind that the template files need to exist beforehand.

With operator-defined templates
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Webinput operator can set the template directly in the user interface with the *Template* button in the *CSV Notifications* section.
If the template is not set using this field, the template is determined by mailgen’s configured formats.

Starting Mailgen
~~~~~~~~~~~~~~~~

If ``mailgen_config_file`` is not set, Mailgen loads the default configuration file (``'/etc/intelmq/intelmq-mailgen.conf'``).
As always, Mailgen additionally reads the configuration file of the user, here the webserver user (``'~/.intelmq/intelmq-mailgen.conf'``).

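The selection of configuration files described above can be sketched as follows. This is only an illustration of which paths are consulted: the function ``mailgen_config_candidates`` is hypothetical, and how Mailgen combines the settings of the two files is not covered here.

```python
from pathlib import Path

# Paths as stated in this guide.
DEFAULT_SYSTEM_CONFIG = Path("/etc/intelmq/intelmq-mailgen.conf")
USER_CONFIG = Path("~/.intelmq/intelmq-mailgen.conf")


def mailgen_config_candidates(webinput_config):
    """Hypothetical helper: list the config files Mailgen would consult.

    ``webinput_config`` is the parsed Webinput configuration (a dict).
    """
    # An unset or empty mailgen_config_file falls back to the default.
    system = Path(webinput_config.get("mailgen_config_file")
                  or DEFAULT_SYSTEM_CONFIG)
    # The user file belongs to the user running the process,
    # i.e. the webserver user when started from Webinput.
    return [system, USER_CONFIG.expanduser()]


default_paths = mailgen_config_candidates({})
oneshot_paths = mailgen_config_candidates(
    {"mailgen_config_file": "/etc/intelmq/intelmq-mailgen-oneshot.conf"}
)
```

Because ``~`` is expanded for the running process, the second path depends on the home directory of the webserver or WSGI user, which is worth checking when a user-level file seems to be ignored.
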
The complete Mailgen workflow
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To run the complete workflow of IntelMQ and Mailgen in Webinput:

- configure all necessary IntelMQ bots

  - any that you wish, plus
  - CERT-Bund Contact Expert
  - CERT-Bund Rules Expert
  - SQL Output, with the special module ``intelmq_webinput_csv.sql_output``

- correctly configure Mailgen
- set up the Mailgen configuration in Webinput

The Postgres connection user must have write access to the events and directives tables (for event insertion).

Target groups
~~~~~~~~~~~~~

The target groups are a special variant of constant fields: the available values depend on the result of an SQL query to the `fody database `__ (contactdb tags), and users select values from multiple-choice checkboxes.
The selected values are saved in the event field ``extra.target_groups``.
The CERT-Bund Rules Expert’s rules can use this information to generate the correct directives.

Configuring this feature works as follows:

.. code:: json

   "target_groups": {
       "database": {
           "host": "localhost",
           "user": "fody",
           "password": "secret",
           "dbname": "contactdb"
       },
       "tag_name_query": "SELECT tag_name FROM tag_name WHERE tag_name_id = 2",
       "tag_value_query": "SELECT tag_value FROM tag WHERE tag_name_id = 2 ORDER BY tag_value"
   }

The first value returned by ``tag_name_query`` is used as the label for the input field, e.g. *Target Group*.
The values returned by ``tag_value_query`` define the possible input values for the multiple-choice checkboxes.
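To illustrate how the two queries turn into a label and a set of checkbox values, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for the PostgreSQL contactdb so that it runs without a server; the two-table toy schema is only inferred from the example queries above, and the function ``load_target_groups`` is hypothetical and not Webinput's actual code.

```python
import sqlite3

# Queries taken from the example configuration above.
TARGET_GROUPS_CONFIG = {
    "tag_name_query": "SELECT tag_name FROM tag_name WHERE tag_name_id = 2",
    "tag_value_query": ("SELECT tag_value FROM tag "
                        "WHERE tag_name_id = 2 ORDER BY tag_value"),
}


def load_target_groups(connection, config):
    """Hypothetical helper: return (label, selectable values)."""
    cursor = connection.cursor()
    cursor.execute(config["tag_name_query"])
    row = cursor.fetchone()
    # The first value of the first row becomes the field label.
    label = row[0] if row else None
    cursor.execute(config["tag_value_query"])
    # Every returned value becomes one checkbox.
    values = [r[0] for r in cursor.fetchall()]
    return label, values


def demo_connection():
    """Toy stand-in for contactdb (assumed, heavily simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tag_name (tag_name_id INTEGER PRIMARY KEY, tag_name TEXT);
        CREATE TABLE tag (tag_name_id INTEGER, tag_value TEXT);
        INSERT INTO tag_name VALUES (1, 'Sector'), (2, 'Target Group');
        INSERT INTO tag VALUES (2, 'Government'), (2, 'Energy'), (1, 'ignored');
    """)
    return conn


label, values = load_target_groups(demo_connection(), TARGET_GROUPS_CONFIG)
```

With this toy data, ``label`` is ``'Target Group'`` and ``values`` is ``['Energy', 'Government']``; the values the operator ticks would then end up in ``extra.target_groups``.
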