Development

Foreword

Being able to easily develop against Threatshell is my ultimate goal. Feel free to open issues or email me if something isn’t clear or isn’t working, or if python isn’t really your thing but you’d like to see something cool added to Threatshell.

Threatshell is built on top of the python cmd module’s Cmd class. It will be just as useful of a source as anything I’d write to explain how things integrate into the main command loop, like adding your def do_<function name> which you’ll see later.

For now, all you need to do to integrate your favorite analysis scripts into Threatshell is follow a few easy steps (which will change in the near future). Before jumping into the steps, there are a couple of prerequisites:

Prerequisites

  • make your script a module containing a class that has all the functions you want to put in Threatshell

  • decide if you want to store your data in elasticsearch or not
    • This will be expanded later to include additional output sinks

Integration

Threatshell commands are currently split up into two parts – the functions that go in the command module, and the functions that use the functions that go in the command module. Sounds confusing, I know, but let’s look at an example. Let’s pretend we have a script named example.py and we want to make it a Threatshell module with a corresponding elasitcsearch document definition, and for extra awesome sauce, we want our example function to run with the magic “q” function.

threatshell.commands.example

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
 # here is the example.py module that goes in threatshell.commands
 from threatshell.commands.q import AutoQuery  # for q function
 from threatshell.doctypes import example as example_docs # we'll get to this

 import logging
 import requests

 log = logging.getLogger(__name__)


 class Example:

     # If you have sensitive information like API keys, user/pass, etc.
     # they will go into the Threatshell config which can then be passed
     # to your module's init function. No need to worry about the security
     # of your keys - that's already been taken care of
     def __init__(self, config):
         self.url = "http://some.api.com"
         self.user = config.get("Example", "user")
         self.passwd = config.get("Example", "pass")


     # This will set our example.request_intel() function to be used on
     # q --domain dom1 dom2 domN
     #
     # If the function works for multiple types of indicators, simply add
     # them to the array passed to the use_on decorator
     # (e.g. ["domain", "ip", "url"])
     #
     # Reference the threatshell.commands.q module for the list of supported
     # auto-query types, or add your own type to it
     #
     @AutoQuery.use_on(["domain"])
     def request_intel(self, domains):

         # This part is important - if you want this funtion to use q
         # correctly, it will need to be able to iterate over a list of
         # queries or it will need to be able to process the list all at
         # once (like with a bulk API). Plus it's nice to be able to do
         # multiple lookups in one command
         if not isinstance(domains, list):
             domains = [domains]

         docs = []
         for domain in domains:
             url = "%s/%s" % (self.url, domain)
             resp = requests.get(url, auth=(self.user, self.passwd))

             # assuming things went well with the request

             doc = example_docs.ExampleIntelDoc(resp.json())
             docs.append(doc)

         return docs

For the elasticsearch document definition, all you have to do is create a simple class containing a mapping of what each key–value pair is and how it should be handled by elasticsearch. So, let’s say our magic intel API we use in threatshell.commands.example returns a json document that looks like this

{
    "domain": "somedomain.com",
    "malicious": false,
    "contact": "someone@somedomain.com",
    "timestamp": "2016-04-20 12:00:00"
}

A simple, short, sweet json document. To turn that document into a Threatshell elasticsearch document you would create a class that looks like the following

threatshell.doctypes.example

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
 #
 # GenericDoc is a base class I created to help with managing serialization,
 # Additional Threatshell fields, and other stuff coming in the future.
 # All Threatshell elasticsearch doctypes should extend this class.
 #
 # There are also some analyzers, filters, etc. in this module which you
 # may find helpful but the other really key thing to include is the
 # ThreatshellIndex decorator. You'll see why shortly.
 #
 from threatshell.doctypes.generic import (
     GenericDoc,
     email_analyzer,
     ThreatshellIndex
 )
 from elasticsearch_dsl import (
     String,
     Boolean
 )

 #
 # Here's that magical decorator. It's important because of how the
 # elasticsearch_dsl module works. When Threatshell starts, all of the
 # document decorators are collected and their respective mappings and
 # settings are sent to whatever elasticsearch servers are in your config
 #
 # In short, no decorator == no ES mapping. This doesn't guarantee a failure,
 # oh no, it's much more sneaky than that. You'll get partial docs generated
 # by the dynamic mapping and fields could end up being improperly configured
 # or behave unexpectedly when searching and whatnot.
 #
 @ThreatshellIndex.doc_type
 class ExampleIntelDoc(GenericDoc):

     #
     # This is to set the name of this document type in
     # elasticsearch
     #
     # (e.g. http://localhost:9200/threatshell/example_intel_doc/{doc_id})
     #
     class Meta:
         doc_type = "example_intel_doc"

     # Notice that these are defined just like in the json. That's because
     # of how the elasticsearch_dsl DocType class works with how it manages
     # attributes.

     domain = String()
     malicious = Boolean()
     contact = String(analyzer=email_analyzer)
     timestamp = Date()

     def __init__(self, jdata={}):
         GenericDoc.__init__(self)
         for k, v in jdata.items():

             # elasticsearch_dsl non-sense
             if v is None:
                 v = {}

             setattr(self, k, v)

         # Notice how we directly bind the json doc to this
         # class instance with setattr. You can achieve the
         # same effect with self.<attr> = <value> or with
         # ExampleIntelDoc().<attr> = <value>

That’s about all you need to get started with elasticsearch_dsl documents. It can get a bit challenging to define some of these docs. The better you define them though, the better ES can do at indexing and searching, which means you can get better correlation tips and stuff when the web interface side of Threatshell gets built out more. So do try to do a good job, and, as always, feel free to ask for help. You can also look at other doctypes I have made to see how more complicated things were achieved. The elasticsearch_dsl ReadTheDocs is also a good place for learning tricks to defining documents

That is almost all you need to know to get a Threatshell module working. All that needs to be done now is to open up threatshell.py and add a few lines of code use the command module. To do so, follow these easy steps:

  • open threatshell.py and add the import statement for your module
from threatshell.commands.example import Example
  • add the instantiation of your module to the MyPrompt.__init__
    • Don’t forget to pass it the config if you need it
class MyPrompt(Cmd):

    def __init__(self, args):

        ...

        self.example_api = Example()
  • add a method that follows the name schema of “do_<command name>” which uses the function in your example module
def do_exmpl_request_intel(self, cmd_args):
    """
    This is a doc string which becomes the help string
    for this function in the CLI. So now you can type
    "help exmpl_request_intel" and see this doc string
    when you are using the shell
    """

    # do argument parsing config

    split_args = shlex.split(cmd_args)
    args = parser.parse_args(args=split_args)
    docs = self.example_api.request_intel(args.domains)

    # Currently a function to handle output and ES document saving
    # but will be changed later to incorporate connectors to
    # other things and output formatting modules
    self._handle_response(docs)