
Multiple load balancing strategies #1361

Merged: 77 commits merged into h2o:master on Jan 14, 2018

Conversation

@zlm2012 (Contributor) commented Jul 9, 2017

This PR makes several optimizations to load balancing:

  • Load balancing now runs each time a new request starts, instead of only when a new socket needs to be created. t/50reverse-proxy-round-robin.t has been modified to verify this.
  • Added h2o_socketpool_target_status_t for recording the in-flight requests and returned sockets of a specific target, and h2o_socketpool_target_status_vector_t, which is passed as an argument to the load-balancing selector function so the balancer can inspect the status of each target (see the sketch after this list).
  • Added void *data_for_balancer to h2o_socketpool_target_t so the proxy handler can attach extra per-target information for load balancing.
  • Moved the round-robin-related code out of lib/common/socketpool.c.
  • Modified h2o_socketpool_init_by_targets() so the proxy handler can now designate the balancer when initializing the socket pool.
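
As a rough illustration of the interface described above (only the type names and the data_for_balancer argument are taken from this PR; the field names and the callback signature are assumptions):

    #include <stddef.h>

    /* per-target status exposed to the balancer; field names are illustrative */
    typedef struct st_h2o_socketpool_target_status_t {
        size_t requests_inflight; /* requests currently on the fly for this target */
        size_t pooled_count;      /* sockets returned to the pool for this target */
    } h2o_socketpool_target_status_t;

    typedef struct st_h2o_socketpool_target_status_vector_t {
        h2o_socketpool_target_status_t *entries;
        size_t size;
    } h2o_socketpool_target_status_vector_t;

    /* the balancer's selector is invoked once per request and returns the
     * index of the target that the request should be forwarded to */
    typedef size_t (*h2o_balancer_select_cb)(h2o_socketpool_target_status_vector_t *status,
                                             void *data_for_balancer);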

@kazuho (Member) commented Jul 10, 2017

Thank you for the PR. This would be a nice step to improve the proxying capability of H2O.

Regarding the design, I see two issues.

I would prefer it if each balancer backend was able to define its own structure of pooled sockets, instead of being forced to use a linked list of pools. In the case of the round-robin selector, this restriction constrains h2o_balancer_rr_selector to an O(N) method.

I believe that locks should be defined by each balancer backend (if necessary). To clarify: is it possible to remove h2o_socket_pool_t::_shared.mutex?

@zlm2012 (Contributor, Author) commented Jul 10, 2017

For the first issue: the pooled sockets that used to live in h2o_socketpool_t::_shared.sockets are now split, according to each socket's remote peer, into a per-target socket list in h2o_socketpool_target_status_t, and a vector holding the status of each target is kept in h2o_socketpool_t::_shared. The load balancer's selector does the same work as before; the difference is that we now run the selector first and only afterwards decide whether to use a pooled socket or create a new one. So I think h2o_balancer_rr_selector does not need an O(N) method (see the sketch below).
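
A minimal sketch of the constant-time round-robin selection being described here, assuming the per-target layout above (the state layout is illustrative, not the code in this PR):

    #include <stddef.h>

    struct rr_state {
        size_t next; /* index of the target to pick for the next request */
    };

    /* selecting a target is O(1) regardless of how many pooled sockets
     * each target holds */
    static size_t rr_select(struct rr_state *state, size_t num_targets)
    {
        size_t selected = state->next;
        state->next = (state->next + 1) % num_targets;
        return selected;
    }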

For the second issue, I was actually planning to do so. But when I considered destroy_expired, we would need to lock every backend's mutex each time expired sockets are recycled. I was not sure the benefit of per-backend locks would be great enough to outweigh that, so I left the shared mutex in place.

@zlm2012 (Contributor, Author) commented Jul 10, 2017

I just realized that we also need a way for the balancer to obtain extra per-request information.

@zlm2012 (Contributor, Author) commented Aug 11, 2017

This PR now looks complete. Several new features have been added:

  • Per-request extra data can be passed via override->req_extra.
  • Least-connection and weighted-round-robin support has been added; weighted round robin is folded into the normal round-robin balancer (see the sketch after the example below).
  • Per-pathconf balancer configuration is supported.

The value of proxy.reverse.backends should now be either a scalar or a sequence, where each element of the sequence can be either a scalar or a mapping containing at least an entry with the key url. For example:

proxy.reverse.backends:
  - url: https://your.treasure.net
    weight: 2
  - http://local.your.treasure.net
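
To illustrate how a weight such as the weight: 2 above could play out, here is a sketch of one common way to combine weights with round robin: each target gets weight consecutive picks per cycle. This is an illustration of the idea, not necessarily the exact scheme implemented in this PR:

    #include <stddef.h>

    struct wrr_target {
        size_t weight; /* e.g. 2 for https://your.treasure.net above */
    };

    struct wrr_state {
        size_t index;     /* current target */
        size_t remaining; /* picks left for the current target in this cycle */
    };

    static void wrr_init(struct wrr_state *s, const struct wrr_target *targets)
    {
        s->index = 0;
        s->remaining = targets[0].weight;
    }

    /* assumes at least one target has a non-zero weight */
    static size_t wrr_select(struct wrr_state *s, const struct wrr_target *targets,
                             size_t num_targets)
    {
        while (s->remaining == 0) {
            s->index = (s->index + 1) % num_targets;
            s->remaining = targets[s->index].weight;
        }
        --s->remaining;
        return s->index;
    }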

proxy.reverse.balancer has been added to configure which balancer is used, together with any balancer-wide settings. Its value should be either a scalar or a mapping; if a mapping is used, it must contain at least one entry with the key type. For example, the directive can be written either as

proxy.reverse.balancer: least-conn

or

proxy.reverse.balancer:
  type: round-robin
  other-key: other value (though currently there are no balancer-wide settings for round robin)

Currently we only support least-conn and round-robin for proxy.reverse.balancer.
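
For reference, least-connection selection boils down to picking the target with the fewest requests in flight. A minimal sketch of that idea (illustrative only; it ignores weights and is not the code in this PR):

    #include <stddef.h>

    /* returns the index of the target with the fewest in-flight requests,
     * breaking ties in favor of the lower index */
    static size_t least_conn_select(const size_t *requests_inflight, size_t num_targets)
    {
        size_t best = 0, i;
        for (i = 1; i < num_targets; ++i)
            if (requests_inflight[i] < requests_inflight[best])
                best = i;
        return best;
    }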

The integration test has been updated so that weighted round robin is also tested. For least connection there is only a generic load-balancer integration test, though; I could not come up with a good way to test least-connection behavior specifically.

@@ -265,11 +266,131 @@ static int on_config_ssl_session_cache(h2o_configurator_command_t *cmd, h2o_conf
return 0;
}

static int parse_balancer(h2o_configurator_command_t *cmd, yoml_t *node, h2o_socketpool_target_t **targets, size_t target_len)
{
@kazuho (Member) commented:

Do we need to define this as a separate function now?

I believe that none of the balancers being defined takes an argument (except the per-target weight). Therefore, the type of the balancer attribute could be restricted to YOML_TYPE_SCALAR; all we need to do is call strcasecmp to determine the correct balancer constructor.
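
A minimal sketch of the dispatch being suggested, assuming scalar-only input (h2o_balancer_create_rr() appears in this PR; the handle type, the least-conn constructor name, and the error handling are guesses):

    #include <strings.h>

    typedef struct st_h2o_balancer_t h2o_balancer_t;      /* opaque handle; name assumed */
    h2o_balancer_t *h2o_balancer_create_rr(void);         /* constructor seen in this PR */
    h2o_balancer_t *h2o_balancer_create_least_conn(void); /* hypothetical counterpart */

    static h2o_balancer_t *create_balancer(const char *name)
    {
        if (strcasecmp(name, "round-robin") == 0)
            return h2o_balancer_create_rr();
        if (strcasecmp(name, "least-conn") == 0)
            return h2o_balancer_create_least_conn();
        return NULL; /* caller reports an unknown balancer type */
    }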

}

if (strcmp(lb_type_node->data.scalar, "round-robin") == 0) {
self->vars->conf.balancer = h2o_balancer_create_rr();
@kazuho (Member) commented on Jan 11, 2018:

Related to my previous comment: I believe that the selected balancer cannot be stored as a value of the configurator context, considering that it is an argument to proxy.reverse.url rather than a separate configuration directive.

Consider the case where multiple mappings are made, in which only the first one has a balancer attribute.

paths:
  '/abc':
    proxy.reverse.url:
      backends: ...
      balancer: least-conn
  '/foo':
    proxy.reverse.url:
      backends: ...

The mapping for /foo will incorrectly use an instance of a least-conn balancer (shouldn't it be round-robin?), and that single instance will be shared by the two proxy definitions (i.e. the one for /abc and the one for /foo), which I believe is also a bug.

@kazuho (Member) commented Jan 11, 2018

Doing my final review, I have gathered some cosmetic changes that I would like to be made in https://github.com/h2o/h2o/tree/kazuho/pr/1361.

Could you please take a look at them? I believe that the only change to the code is dd88f55; the others are style adjustments and comments.

@zlm2012 (Contributor, Author) commented Jan 11, 2018

> Doing my final review, I have gathered some cosmetic changes that I would like to be made in https://github.com/h2o/h2o/tree/kazuho/pr/1361.

Thank you! Those changes look good to me. I'll merge that branch and fix the issues mentioned.

@zlm2012 (Contributor, Author) commented Jan 12, 2018

The configurator has been fixed, and an integration test covering multiple balancers has been added.

@kazuho changed the title from "Optimization for Load Balancing" to "Multiple load balancing strategies" on Jan 14, 2018
@kazuho merged commit 92e0d21 into h2o:master on Jan 14, 2018
@kazuho added a commit referencing this pull request on Jan 14, 2018: "Multiple load balancing strategies"
@kazuho (Member) commented Jan 14, 2018

Thank you for your efforts on the PR. Merged to master.
