Incorporating full three-dimensional models of the reactor core into system transient codes allows for a "best-estimate" calculation of interactions between the core behavior and plant dynamics. Considerable efforts have been made in various countries and organizations to develop coupled thermal-hydraulic and neutronics codes. Appropriate benchmarks have been developed through international cooperation led by the Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD). These benchmarks permit testing of the neutronics-thermal-hydraulics coupling and verification of the capability of the coupled codes to analyze complex transients with coupled core-plant interactions. Three such benchmarks are presented in this paper: the OECD/U.S. Nuclear Regulatory Commission (NRC) pressurized water reactor main steam line break benchmark, the OECD/NRC boiling water reactor turbine trip benchmark, and the OECD/U.S. Department of Energy/Commissariat à l'Energie Atomique V1000 coolant transient benchmark. To meet the objectives of validating best-estimate coupled codes, a systematic approach employing a multilevel methodology has been introduced to evaluate the analyzed transients. Since these benchmarks are based on both code-to-code and code-to-data comparisons, further guidance for presenting and evaluating results has been developed. During the course of the benchmark activities, a professional community has been established, which has allowed in-depth discussion of the different aspects considered in the validation of the coupled codes. This outcome has advanced the state of the art in coupling research.