Talend Component Best Practices

Organize your code

Few recommandations apply to the way a component packages are organized:

  1. ensure to create a package-info.java with the component family/categories at the root of your component package:

@Components(family = "jdbc", categories = "Database")
package org.talend.sdk.component.jdbc;

import org.talend.sdk.component.api.component.Components;
  1. create a package for the configuration

  2. create a package for the actions

  3. create a package for the component and one subpackage by type of component (input, output, processors, …​)

Modelize your configuration

It is recommanded to ensure your configuration is serializable since it is likely you will pass it through your components which can be serialized.

I/O configuration

The first step to build a component is to identify the way it must be configured.

It is generally split into two main big concepts:

  1. the DataStore which is the way you can access the backend

  2. the DataSet which is the way you interact with the backend

Here are some examples to let you get an idea of what you put in each categories:

Example description DataStore DataSet

Accessing a relational database like MySQL

the JDBC driver, url, username and password

the query to execute, row mapper, …​

Access a file system

the file pattern (or directory + file extension/prefix/…​)

the file format, potentially the buffer size, …​

It is common to make the dataset including the datastore since both are required to work. However it is recommanded to replace this pattern by composing both in a higher level configuration model:

@DataSet
public class MyDataSet {
    // ...
}

@DataStore
public class MyDataStore {
    // ...
}


public class MyComponentConfiguration {
    @Option
    private MyDataSet dataset;

    @Option
    private MyDataStore datastore;
}

Processor configuration

Processor configuration is simpler than I/O configuration since it is specific each time. For instance a mapper will take the mapping between the input and output model:

public class MappingConfiguration {
    @Option
    private Map<String, String> fieldsMapping;

    @Option
    private boolean ignoreCase;

    //...
}

I/O recommandations

I/O are particular because they can be linked to a set of actions. It is recommanded to wire all the ones you can apply to ensure the consumers of your component can provide a rich experience to their users.

Here are the most common ones:

Type Action Description Configuration example Action example

DataStore

@Checkable

Expose a way to ensure the datastore/connection works

@DataStore
@Checkable
public class JdbcDataStore
  implements Serializable {

  @Option
  private String driver;

  @Option
  private String url;

  @Option
  private String username;

  @Option
  private String password;
}
@HealthCheck
public HealthCheckStatus healthCheck(@Option("datastore") JdbcDataStore datastore) {
    if (!doTest(dataStore)) {
        // often add an exception message mapping or equivalent
        return new HealthCheckStatus(Status.KO, "Test failed");
    }
    return new HealthCheckStatus(Status.KO, e.getMessage());
}

I/O limitations

Until the studio integration is complete, it is recommanded to limit processors to 1 input.

Handle UI interactions

It is also recommanded to provide as much information as possible to let the UI work with the data during its edition.

Validations

Light validations

The light validations are all the validations you can execute on the client side. They are listed in the UI hint part.

This is the ones to use first before going with custom validations since they will be more efficient.

Custom validations

These ones will enforce custom code to be executed, they are more heavy so try to avoid to use them for simple validations you can do with the previous part.

Here you define an action taking some parameters needed for the validation and you link the option you want to validate to this action. Here is an example to validate a dataset. For example for our JDBC driver we could have:

// ...
public class JdbcDataStore
  implements Serializable {

  @Option
  @Validable("driver")
  private String driver;

  // ...
}

@AsyncValidation("driver")
public ValidationResult validateDriver(@Option("value") String driver) {
  if (findDriver(driver) != null) {
    return new ValidationResult(Status.OK, "Driver found");
  }
  return new ValidationResult(Status.KO, "Driver not found");
}

Note that you can also make a class validable and you can use it to validate a form if you put it on your whole configuration:

// note: some part of the API were removed for brievity

public class MyConfiguration {

  // a lot of @Options
}

public MyComponent {
    public MyComponent(@Validable("configuration") MyConfiguration config) {
        // ...
    }

    //...
}

@AsyncValidation("configuration")
public ValidationResult validateDriver(@Option("value") MyConfiguration configuration) {
  if (isValid(configuration)) {
    return new ValidationResult(Status.OK, "Configuration valid");
  }
  return new ValidationResult(Status.KO, "Driver not valid ${because ...}");
}
the parameter binding of the validation method uses the same logic than the component configuration injection. Therefore the @Option specifies the prefix to use to reference a parameter. It is recommanded to use @Option("value") until you know exactly why you don’t use it. This way the consumer can match the configuration model and just prefix it with value. to send the instance to validate.

Completion

It can be neat and user friendly to provide completion on some fields. Here an example for the available drivers:

// ...
public class JdbcDataStore
  implements Serializable {

  @Option
  @Completable("driver")
  private String driver;

  // ...
}

@Completion("driver")
public CompletionList findDrivers() {
    return new CompletionList(findDriverList());
}

Don’t forget the component representation

Each component must have its own icon:

@Icon(Icon.IconType.DB_INPUT)
@PartitionMapper(family = "jdbc", name = "input")
public class JdbcPartitionMapper
    implements Serializable {
}
you can use talend.surge.sh/icons/ to identify the one you want to use.

Version and component

Not mandatory for the first version but recommanded: enforce the version of your component.

@Version(1)
@PartitionMapper(family = "jdbc", name = "input")
public class JdbcPartitionMapper
    implements Serializable {
}

If you break a configuration entry in a later version ensure to:

  1. upgrade the version

  2. support a migration of the configuration

@Version(value = 2, migrationHandler = JdbcPartitionMapper.Migrations.class)
@PartitionMapper(family = "jdbc", name = "input")
public class JdbcPartitionMapper
    implements Serializable {

    public static class Migrations implements MigrationHandler {
        // implement your migration
    }
}

Don’t forget to test

Testing the components is crucial, you can use unit tests and simple standalone JUnit but it is highly recommanded to have a few Beam tests to ensure your component works in Big Data world.

Contribute to this guide

Don’t hesitate to send your feedback on writing component and best practices you can encounter.

Scroll to top