A Node Controller (NC) in AsterixDB has two or more network threads running at all times: one thread that manages the interaction with the Cluster Controller (CC) and runs at the highest priority, and a set of threads that manage NC to NC communication and run at normal priority. This document describes how each of these threads works.

  1. NC to NC Networking thread
    1. The thread waits for notifications from either its host (when initiating a connection) or a Java NIO network channel.
    2. Each time it is woken, it starts by initiating connection requests. Once a connection is established, a TCPConnection object is created, the interest set for the key is set to 0 (NO_OPS), and connectionEstablished is called on the connection listener, which associates the channel with the multiplexed connection. Each multiplexed connection has its own ChannelControlBlock which has a unique id, a read interface and a write interface.
    3. For each incoming connection, a TCPConnection object is created and acceptedConnection is called on the connection listener.
    4. For each channel that is ready, we determine whether it is readable or writable and notify its registered event listener that IO is ready. The listener first attempts a read by calling driveReaderStateMachine(), then attempts a write by calling driveWriterStateMachine(). These two methods implement the logic of each connection. If different logic is desired, we can either modify the implementation of these methods or create a different event listener that behaves differently when IO is ready.
      1. Drive Reader State Machine
        1. Read from input buffer
        2. Update network counters
        3. If the input buffer holds less than one full command (command size = 8 bytes), we wait until more IO is ready.
        4. Read the command header in the buffer. The command can be:
          1. Add Writer Credits
          2. Close Channel
          3. Close Channel Acknowledgement
          4. Data frame
          5. Report Error
          6. Open Channel
        5. Based on the received command, the channel control block specifies how much data must be read. It attempts to read the data and flush it to the consumer. If not all the needed data is available, the control block keeps a copy of what has been read and waits for its turn to read the remainder before pushing the buffer to the consumer.
          Note: the read step is blocking and can stall the whole network if the consumer of the buffers can't keep up with the rate of flow. To fix this, we need to switch to asynchronous behavior similar to ConcurrentFramePool.
        6. Once data has been pushed to the consumer, the control block is reset and an attempt to read the next chunk of data is made. If no data was available, it returns to the network thread and waits for its next turn.
      2. Drive Writer State Machine
        1. If the current channel has pending data, the writer attempts to write the data to the channel. If it is unable to do that (due to network congestion, for example), the call returns.
        2. If there are no more pending bytes, the writer loops over all the channels associated with the connection
          1. If the channel is a new channel, the writer sends an open channel command
          2. If the channel has pending credit, it sends an add credits command
          3. If the channel has a pending close acknowledgement, it sends the acknowledgement
          4. If the channel has pending data bytes, it sends the data bytes
  2. NC to CC Networking thread
    1. To be continued...
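
The reader state machine steps listed above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the class names (ReaderSketch, Delivery, CommandType) and the header layout (2-byte channel id, 1-byte command type, 1 unused byte, 4-byte argument) are hypothetical. Only the 8-byte command size and the command names come from the description above. The sketch shows how a partial header or a partial payload is kept in the buffer until the next IO-ready notification.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Command names taken from the list above; the ordinal encoding is an assumption. */
enum CommandType {
    ADD_CREDITS, CLOSE_CHANNEL, CLOSE_CHANNEL_ACK, DATA, ERROR, OPEN_CHANNEL
}

/** Hypothetical record of what gets pushed to the consumer. */
final class Delivery {
    final int channelId;
    final CommandType type;
    final byte[] payload;

    Delivery(int channelId, CommandType type, byte[] payload) {
        this.channelId = channelId;
        this.type = type;
        this.payload = payload;
    }
}

final class ReaderSketch {
    static final int COMMAND_SIZE = 8; // command size stated in the text

    // Bytes received but not yet consumed; kept in "write mode" between calls.
    private final ByteBuffer input = ByteBuffer.allocate(1024);
    private final List<Delivery> delivered = new ArrayList<>();

    // State of the command whose payload is still being read.
    private boolean headerRead;
    private int channelId;
    private CommandType type;
    private int toRead;

    /** Called on each IO-ready event; `incoming` stands in for bytes read from the socket. */
    void onReadable(byte[] incoming) {
        input.put(incoming);
        input.flip(); // switch to "read mode"
        while (true) {
            if (!headerRead) {
                if (input.remaining() < COMMAND_SIZE) {
                    break; // less than one full command: wait for more IO
                }
                // Assumed layout: channel id, command type, padding, argument.
                channelId = input.getShort() & 0xFFFF;
                type = CommandType.values()[input.get()];
                input.get(); // unused byte
                int argument = input.getInt();
                // In this sketch only data frames carry a payload.
                toRead = (type == CommandType.DATA) ? argument : 0;
                headerRead = true;
            }
            if (input.remaining() < toRead) {
                break; // partial payload: keep what we have and wait for our next turn
            }
            byte[] payload = new byte[toRead];
            input.get(payload);
            delivered.add(new Delivery(channelId, type, payload)); // push to the consumer
            headerRead = false; // reset and try to read the next command
        }
        input.compact(); // preserve unread bytes for the next call
    }

    List<Delivery> delivered() {
        return delivered;
    }
}
```

Feeding the sketch a data frame in three pieces (a partial header, the rest of the header with part of the payload, then the final byte) produces a single delivery only after the last piece arrives.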

 

FAQ

  1. What triggers a network connection request?
  2. I need to create a connection from an NC to another NC, what steps should I take?
    1. Get a reference to the NetworkManager instance in the NC.
    2. Get the address you want to connect to.
    3. Call the NetworkManager connect method, which returns a Channel Control Block.
    4. Use the Channel Control Block to read and write to the channel.
    5. Close the writer interface when done.
    Note that you still need to change the event listener behavior if you want to handle commands besides the ones listed above.
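
The steps above can be sketched as follows. The interfaces Network, Channel and WriteSide are hypothetical stand-ins for the NetworkManager, the Channel Control Block and its write interface; their method names are assumptions made for this example and are not the real signatures. FakeNetwork is an in-memory fake so the sketch runs without a cluster. Reading from the channel is omitted for brevity.

```java
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the write interface of a Channel Control Block. */
interface WriteSide {
    void write(ByteBuffer frame);

    void close();
}

/** Hypothetical stand-in for a Channel Control Block. */
interface Channel {
    WriteSide writer();
}

/** Hypothetical stand-in for the NC's NetworkManager. */
interface Network {
    Channel connect(SocketAddress remote);
}

/** In-memory fake that records calls instead of touching the network. */
final class FakeNetwork implements Network {
    final List<String> log = new ArrayList<>();

    @Override
    public Channel connect(SocketAddress remote) {
        log.add("connect");
        WriteSide writer = new WriteSide() {
            public void write(ByteBuffer frame) {
                log.add("write " + frame.remaining());
            }

            public void close() {
                log.add("close");
            }
        };
        return () -> writer;
    }
}

final class ConnectSketch {
    /** Steps 3 to 5: connect, write through the channel, close the writer when done. */
    static void sendOnce(Network network, SocketAddress remote, byte[] data) {
        Channel channel = network.connect(remote);
        WriteSide writer = channel.writer();
        try {
            writer.write(ByteBuffer.wrap(data));
        } finally {
            writer.close(); // always close the writer interface, even on failure
        }
    }
}
```

Closing the writer in a finally block is a design choice of this sketch: it ensures step 5 runs even if the write throws.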

 

 
