<!DOCTYPE html>
<html lang="en">
<head>
<!-- Google tag (gtag.js) -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-0K19QDCVYV"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-0K19QDCVYV');
</script>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>TrustGen: On the Trustworthiness of Generative Foundational Models</title>
<meta name="description" content="TrustGen provides comprehensive framework and benchmarks for evaluating AI model trustworthiness across 7 dimensions: truthfulness, safety, fairness, privacy, robustness, machine ethics, and advanced AI risks.">
<meta name="keywords" content="AI trustworthiness, generative AI, foundation models, AI safety, AI ethics, model evaluation, AI benchmark, LLM evaluation, AI reliability, machine ethics, AI privacy, model robustness, AI fairness, AI alignment, trustworthy AI, AI risk assessment, GenFMs, AI governance, AI testing framework, AI security, responsible AI, AI transparency, AI accountability, AI compliance, AI performance metrics, AI safety standards, deep learning safety, AI model testing, ethical AI development, AI trust framework, AI validation">
<!-- Favicons -->
<link href="assets/img/favicon.png" rel="icon">
<link href="assets/img/apple-touch-icon.png" rel="apple-touch-icon">
<!-- Fonts -->
<link href="https://fonts.googleapis.com" rel="preconnect">
<link href="https://fonts.gstatic.com" rel="preconnect" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Roboto:ital,wght@0,100;0,300;0,400;0,500;0,700;0,900;1,100;1,300;1,400;1,500;1,700;1,900&family=Poppins:ital,wght@0,100;0,200;0,300;0,400;0,500;0,600;0,700;0,800;0,900;1,100;1,200;1,300;1,400;1,500;1,600;1,700;1,800;1,900&family=Raleway:ital,wght@0,100;0,200;0,300;0,400;0,500;0,600;0,700;0,800;0,900;1,100;1,200;1,300;1,400;1,500;1,600;1,700;1,800;1,900&display=swap" rel="stylesheet">
<link href="https://fonts.googleapis.com/css2?family=Dancing+Script&display=swap" rel="stylesheet">
<link href='https://fonts.googleapis.com/css?family=Poppins' rel='stylesheet'>
<link href="https://fonts.googleapis.com/css?family=Montserrat:800|Crimson+Text:italic|Neuton:300" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Poppins:600|IM+Fell+English:italic|Amethysta:regular" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Hind+Madurai:700|IM+Fell+Double+Pica:italic|Ovo:regular" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Raleway:500|Raleway:600|Frank+Ruhl+Libre:300" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="assets/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="assets/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="assets/vendor/aos/aos.css" rel="stylesheet">
<link href="assets/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
<link href="assets/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css" rel="stylesheet">
<!-- Main CSS File -->
<link href="assets/css/main.css" rel="stylesheet">
<link rel="stylesheet" href="https://anijs.github.io/lib/anicollection/anicollection.css">
<!-- Open Graph Meta Tags -->
<meta property="og:title" content="TrustGen: Complete Framework for AI Model Trustworthiness Evaluation">
<meta property="og:description" content="Comprehensive evaluation framework and dynamic benchmark system for assessing trustworthiness of generative AI models across multiple dimensions.">
<meta property="og:image" content="https://trustgen.github.io/assets/img/header-logo.png">
<meta property="og:url" content="https://trustgen.github.io/">
<meta property="og:type" content="website">
<!-- Twitter Card Meta Tags -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="TrustGen: AI Model Trustworthiness Evaluation Framework">
<meta name="twitter:description" content="Leading framework for evaluating AI model trustworthiness, safety, and reliability across 7 key dimensions.">
<meta name="twitter:image" content="https://trustgen.github.io/assets/img/header-logo.png">
<!-- Additional SEO Meta Tags -->
<meta name="author" content="TrustGen Research Team">
<meta name="robots" content="index, follow">
<link rel="canonical" href="https://trustgen.github.io/">
<!-- Schema.org Markup -->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "ResearchProject",
"name": "TrustGen",
"description": "Framework and benchmark system for evaluating trustworthiness of generative AI models",
"url": "https://trustgen.github.io/",
"keywords": "AI trustworthiness, generative models, AI safety, model evaluation",
"author": {
"@type": "Organization",
"name": "TrustGen Research Team"
}
}
</script>
<!-- =======================================================
* Template Name: Anyar
* Template URL: https://bootstrapmade.com/anyar-free-multipurpose-one-page-bootstrap-theme/
* Updated: Aug 07 2024 with Bootstrap v5.3.3
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
</head>
<body class="index-page">
<header id="header" class="header d-flex align-items-center fixed-top">
<div class="container position-relative d-flex align-items-center justify-content-between">
<a class="logo d-flex align-items-center me-auto me-xl-0">
<!-- Uncomment the line below if you also wish to use an image logo -->
<!-- <img src="assets/img/logo.png" alt=""> -->
<!-- <h1 class="sitename">Trust</h1> -->
<img src="assets/img/header-logo.png" class="img-fluid" id="logo-img" alt="TrustGen logo">
</a>
<nav id="navmenu" class="navmenu">
<ul>
<li><a href="#hero" class="active">Home</a></li>
<li class="dropdown">
<a href="#"><span>Explore</span> <i class="bi bi-chevron-down toggle-dropdown"></i></a>
<ul>
<li><a href="#about">Background</a></li>
<li><a href="#rules">Guideline</a></li>
<li><a href="#dimensions">Dimensions</a></li>
<li><a href="#models">Benchmark</a></li>
<li><a href="#leaderboard">Leaderboard</a></li>
<li><a href="#discussion">Discussion</a></li>
<li><a href="#team">Team</a></li>
</ul>
</li>
<!-- <li><a href="#about">Background</a></li> -->
<!-- <li><a href="#dimensions">Dimensions</a></li>
<li><a href="#models">Models</a></li>
<li><a href="#leaderboard">Leaderboard</a></li>
<li><a href="#discussion">Discussion</a></li> -->
<li><a href="https://trusteval-docs.readthedocs.io/">Docs</a></li>
<li><a href="https://github.com/TrustGen/TrustEval-toolkit">GitHub</a></li>
<li><a href="https://arxiv.org/abs/2502.14296">Paper</a></li>
<!-- <li><a href="#contact">Contact</a></li> -->
</ul>
<i class="mobile-nav-toggle d-xl-none bi bi-list"></i>
</nav>
<!-- <div class="header-social-links">
<a href="#" class="twitter"><i class="bi bi-twitter-x"></i></a>
<a href="#" class="facebook"><i class="bi bi-facebook"></i></a> -->
<!-- <a href="#" class="instagram"><i class="bi bi-instagram"></i></a>-->
<!-- <a href="#" class="linkedin"><i class="bi bi-linkedin"></i></a>-->
<!-- </div> -->
</div>
</header>
<main class="main">
<!-- Hero Section -->
<section id="hero" class="hero section">
<div class="img-background-container"> <!-- Use a class rather than an id -->
<div class="img-background" id="background"></div>
</div>
<video autoplay muted playsinline preload="auto" id="background-video">
<source src="assets/img/background-video - 02-4.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
<div class="container" id="title-container">
<div class="row justify-content-center">
<div class="col-lg-7 text-center" data-aos="fade-up" data-aos-delay="100">
<!-- <div class="col-12 container-type">
<div class="title-text">TrustGen</div>
</div> -->
<!-- <img src="assets/img/header-logo.png" class="img-fluid custom-padding" alt="" style="width: 60%;"> -->
<h2><span>On the </span>Trustworthiness <span>of Generative Foundational Models</span></h2>
<p>Guideline, Assessment, and Perspective</p>
<!-- <a href="#about" class="btn-get-started">Get Started</a>-->
</div>
</div>
</div>
</section><!-- /Hero Section -->
<!-- About Section -->
<section id="about" class="about section">
<!-- Section Title -->
<div class="container section-title" data-aos="fade-up">
<h2>Background</h2>
<!-- <p>Necessitatibus eius consequatur ex aliquid fuga eum quidem sint consectetur velit</p> -->
</div><!-- End Section Title -->
<div class="container" data-aos="fade-up">
<p style="font-family: 'Dancing Script', cursive; font-size: 1.5em;">
"Trust is the glue of life. It's the most essential ingredient in effective communication. It's the foundational principle
that holds all relationships."
</p>
<p style="text-align: right; font-family: 'Dancing Script', cursive; font-size: 1.5em;">
– Stephen R. Covey
</p>
<br>
<img src="assets/img/milestone.png" class="img-fluid my-2" alt="">
<div class="row gy-10 my-4">
<div class="content col-xl-12 d-flex flex-column" data-aos="fade-up" data-aos-delay="100">
<!-- <h3 class=" align-self-center align-self-xl-middle">TL; DR</h3>-->
<p>In this paper, we first establish a standardized set of guidelines that provide a consistent, cross-disciplinary foundation for defining and assessing the trustworthiness of generative foundation models (GenFMs). Then, recognizing that current benchmarks focus narrowly on specific model categories and are static rather than dynamic, we introduce <span class="icon-before-trustgen"></span>TrustGen, a comprehensive and adaptive benchmark for evaluating GenFMs across multiple dimensions of trustworthiness. Finally, we provide an in-depth discussion of trustworthy GenFMs, covering the fundamental nature of trustworthiness, evaluation methodologies, the role of interdisciplinary collaboration, societal and downstream implications, and trustworthiness-related technical approaches.
</p>
<img src="assets/img/intro.png" class="img-fluid" alt="" >
<!-- <a href="#" class="about-btn align-self-center align-self-xl-middle"><span>About us</span> <i class="bi bi-chevron-right"></i></a>-->
<!-- Icon Boxes in a Horizontal Layout -->
<div class="row gy-5 my-4" data-aos="fade-up" data-aos-delay="200">
<div class="col-md-4 icon-box position-relative">
<img src="assets/img/compliant.png" alt="guideline icon" style="width: 15%;">
<h4>Thrust 1: A Standardized Set of Guidelines for Trustworthy GenFMs</h4>
<ul class="ps-3" style="list-style-type: disc;">
<li>GenFMs' societal impact demands deeper trust analysis.</li>
<li>Addressing key challenges guides ethical and aligned advancements.</li>
</ul>
</div><!-- Icon-Box -->
<div class="col-md-4 icon-box position-relative">
<img src="assets/img/framework-1.png" alt="framework icon" style="width: 14%;">
<h4>Thrust 2: Dynamic Evaluation on Trustworthiness of GenFMs</h4>
<ul class="ps-3" style="list-style-type: disc;">
<li>Static evaluations cannot keep pace with evolving GenFMs.</li>
<li>Dynamic frameworks ensure continuous and adaptable trust assessment.</li>
</ul>
</div><!-- Icon-Box -->
<div class="col-md-4 icon-box position-relative">
<img src="assets/img/research.png" alt="analysis icon" style="width: 14%;">
<h4>Thrust 3: In-depth Discussion on Challenges and Future Research</h4>
<ul class="ps-3" style="list-style-type: disc;">
<li>Lack of unified standards causes inconsistent trust practices.</li>
<li>Clear guidelines are essential for the trustworthy and reliable use of GenFMs.</li>
</ul>
</div><!-- Icon-Box -->
</div>
<!-- Rules Section -->
<section id="rules" class="rules section">
<div class="container">
<div class="row gy-5">
<div class="content col-xl-5 d-flex flex-column" data-aos="fade-up" data-aos-delay="100">
<div class="container section-title" data-aos="fade-up">
<h2>Guidelines of Trustworthy Foundation Models</h2>
</div>
<p class="rules-intro">
To define a set of guidelines that govern the models' behavior and ensure their trustworthiness, we first establish the following key considerations:
</p>
<ul class="list-unstyled">
<li style="margin-top: 10pt; margin-bottom: 10pt;"><i class="bi bi-check-circle text-primary me-2"></i>Legal Compliance</li>
<li style="margin-top: 10pt; margin-bottom: 10pt;"><i class="bi bi-check-circle text-primary me-2"></i>Ethics and Social Responsibility</li>
<li style="margin-top: 10pt; margin-bottom: 10pt;"><i class="bi bi-check-circle text-primary me-2"></i>Risk Management</li>
<li style="margin-top: 10pt; margin-bottom: 10pt;"><i class="bi bi-check-circle text-primary me-2"></i>User-Centered Design During Application</li>
<li style="margin-top: 10pt; margin-bottom: 10pt;"><i class="bi bi-check-circle text-primary me-2"></i>Adaptability and Sustainability</li>
</ul>
</div>
<div class="col-xl-7" data-aos="fade-up" data-aos-delay="200">
<div class="row gy-4">
<div class="col-12">
<div class="accordion" id="trustModelRules">
<div class="accordion-item">
<h2 class="accordion-header" id="headingOne">
<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapseOne" aria-expanded="true" aria-controls="collapseOne">
Guideline 1: Fairness and Universal Values
</button>
</h2>
<div id="collapseOne" class="accordion-collapse collapse show" aria-labelledby="headingOne" data-bs-parent="#trustModelRules">
<div class="accordion-body">
The generative model should be designed and trained to ensure fairness, uphold universal values, and minimize biases in all user interactions. It must align with fundamental moral principles, be respectful of user differences, and avoid generating harmful, offensive, or inappropriate content in any context.
</div>
</div>
</div>
<div class="accordion-item">
<h2 class="accordion-header" id="headingTwo">
<button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseTwo" aria-expanded="false" aria-controls="collapseTwo">
Guideline 2: Transparency
</button>
</h2>
<div id="collapseTwo" class="accordion-collapse collapse" aria-labelledby="headingTwo" data-bs-parent="#trustModelRules">
<div class="accordion-body">
The generative model's intended use and limitations should be clearly communicated to users, and information that bears on the model's trustworthiness should be made transparent.
</div>
</div>
</div>
<div class="accordion-item">
<h2 class="accordion-header" id="headingThree">
<button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseThree" aria-expanded="false" aria-controls="collapseThree">
Guideline 3: Human Oversight
</button>
</h2>
<div id="collapseThree" class="accordion-collapse collapse" aria-labelledby="headingThree" data-bs-parent="#trustModelRules">
<div class="accordion-body">
Human oversight is required at all stages of model development, from design to deployment, ensuring full control and accountability for the model's behaviors.
</div>
</div>
</div>
<div class="accordion-item">
<h2 class="accordion-header" id="headingFour">
<button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseFour" aria-expanded="false" aria-controls="collapseFour">
Guideline 4: Accountability
</button>
</h2>
<div id="collapseFour" class="accordion-collapse collapse" aria-labelledby="headingFour" data-bs-parent="#trustModelRules">
<div class="accordion-body">
Developers and organizations should be identifiable and held responsible for the model's behaviors. Accountability mechanisms, including audits and compliance with regulatory standards, should be in place to enforce this.
</div>
</div>
</div>
<div class="accordion-item">
<h2 class="accordion-header" id="headingFive">
<button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseFive" aria-expanded="false" aria-controls="collapseFive">
Guideline 5: Robustness
</button>
</h2>
<div id="collapseFive" class="accordion-collapse collapse" aria-labelledby="headingFive" data-bs-parent="#trustModelRules">
<div class="accordion-body">
The generative model should demonstrate robustness against adversarial attacks and be capable of properly handling rare or unusual inputs. Continuous updates and testing are necessary to maintain robustness and avoid unpredictable behaviors.
</div>
</div>
</div>
<div class="accordion-item">
<h2 class="accordion-header" id="headingSix">
<button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSix" aria-expanded="false" aria-controls="collapseSix">
Guideline 6: Harmlessness
</button>
</h2>
<div id="collapseSix" class="accordion-collapse collapse" aria-labelledby="headingSix" data-bs-parent="#trustModelRules">
<div class="accordion-body">
The model should prioritize harmlessness while maximizing its helpfulness, without causing harm or negatively affecting others' assets, including physical, digital, or reputational resources. The model must not generate content that could result in harmful outcomes under any reasonable circumstances involving human interaction.
</div>
</div>
</div>
<div class="accordion-item">
<h2 class="accordion-header" id="headingSeven">
<button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSeven" aria-expanded="false" aria-controls="collapseSeven">
Guideline 7: Reliability and Accuracy
</button>
</h2>
<div id="collapseSeven" class="accordion-collapse collapse" aria-labelledby="headingSeven" data-bs-parent="#trustModelRules">
<div class="accordion-body">
The model should generate reliable and accurate information, and make correct judgments, avoiding the spread of misinformation. When the information is uncertain or speculative, the model should clearly communicate this uncertainty to the user.
</div>
</div>
</div>
<div class="accordion-item">
<h2 class="accordion-header" id="headingEight">
<button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseEight" aria-expanded="false" aria-controls="collapseEight">
Guideline 8: Privacy Protection
</button>
</h2>
<div id="collapseEight" class="accordion-collapse collapse" aria-labelledby="headingEight" data-bs-parent="#trustModelRules">
<div class="accordion-body">
The generative model must ensure privacy and data protection, which includes the information initially provided by the user and the information generated about the user throughout their interaction with the model.
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</section><!-- /Rules Section -->
<!-- dimensions Section -->
<section id="dimensions" class="dimensions section">
<!-- Section Title -->
<div class="container section-title" data-aos="fade-up">
<h2>Trustworthiness Dimensions</h2>
<p>We evaluate the trustworthiness of GenFMs along seven dimensions.</p>
</div><!-- End Section Title -->
<div class="container">
<div class="row gy-3">
<div class="col-lg-4 col-md-6 service-item d-flex" data-aos="fade-up" data-aos-delay="100">
<div class="icon flex-shrink-0 p-2"><img src="assets/img/icons/Truthful.png" alt="Truthfulness Icon" class="section-icon"></div>
<div>
<h4 class="title">Truthfulness</h4>
<p class="description">refers to the ability to provide accurate, factual, and unbiased information while avoiding hallucinations, sycophancy, and deceptive outputs.</p>
<!-- <a href="" class="readmore stretched-link"><span>Learn More</span><i class="bi bi-arrow-right"></i></a> -->
</div>
</div>
<!-- End Service Item -->
<div class="col-lg-4 col-md-6 service-item d-flex" data-aos="fade-up" data-aos-delay="200">
<div class="icon flex-shrink-0 p-2"><img src="assets/img/icons/Safety.png" alt="Safety Icon" class="section-icon"></div>
<div>
<h4 class="title">Safety</h4>
<p class="description">refers to resilience against adversarial attacks, prevention of harmful or toxic outputs, and robustness in maintaining intended functionality even when faced with attempts to manipulate or bypass their safeguards.</p>
<!-- <a href="" class="readmore stretched-link"><span>Learn More</span><i class="bi bi-arrow-right"></i></a> -->
</div>
</div><!-- End Service Item -->
<div class="col-lg-4 col-md-6 service-item d-flex" data-aos="fade-up" data-aos-delay="300">
<div class="icon flex-shrink-0 p-2"><img src="assets/img/icons/fairness.png" alt="Fairness Icon" class="section-icon"></div>
<div>
<h4 class="title">Fairness</h4>
<p class="description">refers to the ability to provide unbiased outputs by avoiding stereotypes, disparagement, and preferential treatment across different groups and viewpoints.</p>
<!-- <a href="" class="readmore stretched-link"><span>Learn More</span><i class="bi bi-arrow-right"></i></a> -->
</div>
</div><!-- End Service Item -->
<div class="col-lg-4 col-md-6 service-item d-flex" data-aos="fade-up" data-aos-delay="400">
<div class="icon flex-shrink-0 p-2"><img src="assets/img/icons/privacy.png" alt="Privacy Icon" class="section-icon"></div>
<div>
<h4 class="title">Privacy</h4>
<p class="description">refers to the ability to protect sensitive information and prevent unauthorized data extraction, ensuring compliance with privacy standards while resisting privacy attacks such as data extraction and membership inference.</p>
<!-- <a href="" class="readmore stretched-link"><span>Learn More</span><i class="bi bi-arrow-right"></i></a> -->
</div>
</div><!-- End Service Item -->
<div class="col-lg-4 col-md-6 service-item d-flex" data-aos="fade-up" data-aos-delay="500">
<div class="icon flex-shrink-0 p-2"><img src="assets/img/icons/robust.png" alt="Robustness Icon" class="section-icon"></div>
<div>
<h4 class="title">Robustness</h4>
<p class="description">refers to the ability to maintain accurate and stable outputs when the input is perturbed.</p>
<!-- <a href="" class="readmore stretched-link"><span>Learn More</span><i class="bi bi-arrow-right"></i></a> -->
</div>
</div><!-- End Service Item -->
<div class="col-lg-4 col-md-6 service-item d-flex" data-aos="fade-up" data-aos-delay="600">
<div class="icon flex-shrink-0 p-2"><img src="assets/img/icons/Ethics.png" alt="Machine Ethics Icon" class="section-icon"></div>
<div>
<h4 class="title">Machine Ethics</h4>
<p class="description">is the field focused on embedding ethical principles into autonomous AI systems, enabling them to make decisions that align with moral reasoning and societal values independently.</p>
<!-- <a href="" class="readmore stretched-link"><span>Learn More</span><i class="bi bi-arrow-right"></i></a> -->
</div>
</div><!-- End Service Item -->
<div class="col-lg-4 col-md-6 service-item d-flex" data-aos="fade-up" data-aos-delay="600">
<div class="icon flex-shrink-0 p-2"><img src="assets/img/icons/risk-1.png" alt="Advanced AI Risk Icon" class="section-icon"></div>
<div>
<h4 class="title">Advanced AI Risk</h4>
<p class="description">refers to the potential for autonomous AI systems to pursue goals that conflict with human well-being, raising concerns about uncontrollable scenarios and the need for alignment with ethical and safety standards to protect society.</p>
<!-- <a href="" class="readmore stretched-link"><span>Learn More</span><i class="bi bi-arrow-right"></i></a> -->
</div>
</div><!-- End Service Item -->
</div>
</div>
</section><!-- /Dimensions Section -->
<section id="models" class="models section">
<!-- Section Title -->
<div class="container section-title" data-aos="fade-up">
<h2>Dynamic Benchmark</h2>
</div><!-- End Section Title -->
<div class="row justify-content-center" data-aos="fade-up">
<p>An overview of <span class="icon-before-trustgen"></span>TrustGen, a dynamic benchmark system built from three key components: a metadata curator, a
test case builder, and a contextual variator. It evaluates three categories of GenFMs (Text-to-Image Models, Large Language Models, and Vision Language Models) across seven trustworthiness dimensions, using a broad
set of metrics to ensure a thorough assessment.</p>
<div class="col-11">
<img src="assets/img/TrustGen_Overview.png" class="img-fluid" alt="">
</div>
</div>
<div class="container" data-aos="fade-up">
<div class="row gy-4" >
<div class="container mt-5">
<h4>Large Language Models</h4>
<div class="rounded-4 border border-light overflow-hidden">
<table class="table table-hover mb-0 custom-table" id="llm-table">
<thead>
</thead>
<tbody>
</tbody>
</table>
</div>
</div>
<div class="container mt-5">
<h4>Vision Language Models</h4>
<div class="rounded-4 border border-light overflow-hidden">
<table class="table table-hover mb-0 custom-table" id="vlm-table">
<thead>
</thead>
<tbody>
</tbody>
</table>
</div>
</div>
<div class="container mt-5">
<h4>Text-to-Image Models</h4>
<div class="rounded-4 border border-light overflow-hidden">
<table class="table table-hover mb-0 custom-table" id="t2i-table">
<thead>
<tr class="table-primary"> <!-- The T2I table uses a light orange header -->
<th>Model</th>
<th>Model Size</th>
<th>Open-Weight</th>
<th>Version</th>
<th>Creator</th>
<th>Source</th>
<th>Link</th>
</tr>
</thead>
<tbody>
<!-- JavaScript populates this table body -->
</tbody>
</table>
</div>
</div>
</div>
</div>
</section>
<section id="leaderboard" class="leaderboard section">
<!-- Section Title -->
<div class="container section-title" data-aos="fade-up">
<h2>Leaderboard</h2>
</div><!-- End Section Title -->
<div class="container" data-aos="fade-up">
<div class="row gy-4" >
<div class="container mt-5">
<h4>Large Language Model Results</h4>
<div class="rounded-4 border border-light overflow-hidden">
<table class=" table table-hover mb-0 data-table table-green" id="llm-res-table">
<thead>
</thead>
<tbody>
</tbody>
</table>
</div>
</div>
<div class="container mt-5">
<h4>Vision Language Model Results</h4>
<div class="rounded-4 border border-light overflow-hidden">
<table class=" table table-hover mb-0 data-table table-orange" id="vlm-res-table">
<thead>
</thead>
<tbody>
</tbody>
</table>
</div>
</div>
<div class="container mt-5">
<h4>Text-to-Image Results</h4>
<div class="rounded-4 border border-light overflow-hidden">
<table class=" table table-hover mb-0 data-table table-blue" id="t2i-res-table">
<thead>
</thead>
<tbody>
</tbody>
</table>
</div>
</div>
</div>
</div>
</section>
<!-- Call To Action Section -->
<!-- <section id="call-to-action" class="call-to-action section dark-background">
<div class="container">
<img src="assets/img/video_bg_00.png" alt="">
<div class="content row justify-content-center" data-aos="zoom-in" data-aos-delay="100">
<div class="col-xl-10">
<div class="text-center">
<a href="" class="glightbox pulsating-play-btn"></a>
<h3>Click to Play</h3>
<p></p>
<a class="cta-btn" href="#">Introduction</a>
</div>
</div>
</div>
</div>
</section> -->
<!-- /Call To Action Section -->
<!-- discussion Section -->
<section id="discussion" class="discussion section">
<!-- Section Title -->
<div class="container section-title " data-aos="fade-up">
<h2>In-Depth Discussion</h2>
</div><!-- End Section Title -->
<div class="container" data-aos="fade-up">
<div class="row">
<div class="col-lg-6" data-aos="fade-up" data-aos-delay="100">
<div class="discussion-container">
<!-- 10.1 -->
<div class="discussion-item">
<h3>Trustworthiness is Subject to Dynamic Changes</h3>
<div class="discussion-content">
<p>Trustworthiness in generative models is increasingly viewed as a dynamic, context-dependent quality that must be adapted to the unique demands and ethical considerations of various applications. In educational settings, for instance, models must prioritize the exclusion of harmful content, aligning with strict standards to protect young audiences. Conversely, in creative, medical, or research domains, the trustworthiness of a model may depend on its capacity for creative freedom or the inclusion of graphic content, challenging the model's adaptability to diverse expectations. Two main approaches address this dynamic trustworthiness: using highly specialized models for specific tasks or designing models capable of context-sensitive adaptations. The latter approach emphasizes adaptive and context-aware governance but raises challenges in model alignment and contextual interpretation. To assess these models accurately, a shift from static evaluation metrics to more fluid, adaptable frameworks is needed, recognizing the varying requirements of different stakeholders.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.2 -->
<div class="discussion-item">
<h3>Trustworthiness Enhancement Should Not Be Predicated On A Loss of Utility</h3>
<div class="discussion-content">
<p>As generative models evolve, balancing trustworthiness with utility has become a critical concern. The California AI Bill (SB 1047), aimed at ensuring model trustworthiness, has sparked debate over whether stringent safeguards might stifle innovation. Research reveals that trustworthiness and utility are closely linked, with models often requiring both to be effective. Prioritizing trustworthiness excessively can limit utility, making models overly restrictive, while focusing solely on utility can lead to issues like bias and ethical risks. Thus, a sustainable approach requires enhancing both aspects in tandem. Approaches like the "harmlessness-first" strategy, which establishes a foundation of trustworthiness before optimizing for utility, exemplify how both dimensions can reinforce each other. This balanced framework aims to produce models that are both reliable and useful across diverse applications, avoiding the pitfalls of sacrificing one for the other.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.3 -->
<div class="discussion-item">
<h3>Reassessing Ambiguities in the Safety of Attacks and Defenses</h3>
<div class="discussion-content">
<p>The difficulty in distinguishing harmful from benign content in generative models creates significant challenges for implementing effective safety mechanisms. Ambiguities in assessing both input and output complicate ethical and practical considerations. For instance, seemingly harmless inputs may subtly encourage harmful outputs, and outputs that include moral disclaimers can still be misused if altered. These ambiguities underscore the need for clearer, more precise definitions and standards for safety in generative models. As these models evolve, addressing such complexities will be vital to their safe and ethical use across different applications.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.4 -->
<div class="discussion-item">
<h3>Dual Perspectives on Fair Evaluation: Developers and Attackers</h3>
<div class="discussion-content">
<p>Evaluating generative models raises a pivotal yet often overlooked question: should the evaluation be framed from the standpoint of developers or attackers? From a developer's perspective, evaluations prioritize the model's ethical adherence and its capacity to block harmful interactions entirely, emphasizing moral responsibility and safety. In contrast, attackers assess a model based on its susceptibility to manipulation, where even incorrect responses or non-responses are deemed failures. Advocating for the developer's perspective in evaluations promotes trustworthiness, focusing on the model's ability to resist exploitation rather than just delivering accurate answers under ideal conditions, thus addressing real-world security challenges.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.5 -->
<div class="discussion-item">
<h3>A Need for Extendable Evaluation in Complex Generative Systems</h3>
<div class="discussion-content">
<p>Current evaluation frameworks are limited in assessing complex generative systems with multiple interacting models and multi-modal outputs. Such systems require advanced evaluation due to (1) dependencies among multiple models, where errors in one model affect others, (2) the need for metrics that assess cross-modality coherence, and (3) challenges in maintaining consistency and scalability as systems grow. Traditional benchmarks fail to capture these dynamics, necessitating new methods that can evaluate inter-model collaboration, multi-modal consistency, and scalability to support the trustworthiness of increasingly intricate generative systems.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.6 -->
<div class="discussion-item">
<h3>Integrated Protection of Model Alignment and External Security</h3>
<div class="discussion-content">
<p>Our work highlights the importance of combining internal alignment mechanisms with external safety protections to enhance the trustworthiness of generative models, particularly large language and vision models. Internal alignment faces limitations, such as inherent flaws in current alignment methods and reduced model utility when safety is overly prioritized. Meanwhile, external protections, like moderators and classifiers, offer additional layers of security by identifying harmful content and adapting to dynamic scenarios without compromising the model's core functionality. This integrated approach promises a safer, more adaptable framework for deploying generative models across diverse applications.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.7 -->
<div class="discussion-item">
<h3>Interdisciplinary Collaboration is Essential to Ensure Trustworthiness</h3>
<div class="discussion-content">
<p>Interdisciplinary collaboration with generative models fosters a mutually beneficial relationship: by integrating insights from fields such as ethics, psychology, and domain-specific expertise, we gain a comprehensive understanding of model trustworthiness, robustness, and reliability. This diverse input not only informs the development of safer and more ethical generative models but also enhances the disciplines involved by enabling trustworthy models to advance research, improve decision-making, and support autonomous systems. This synergy promotes a continuous cycle of innovation, driving progress in both the models themselves and across the broader scientific and technological landscape.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.8 -->
<div class="discussion-item">
<h3>When Generative Models Meet Ethical Dilemmas</h3>
<div class="discussion-content">
<p>The integration of generative models in decision-making highlights ethical complexities, as models exhibit varied responses to moral dilemmas. While some models remain neutral, others prioritize utilitarian outcomes or display biases, such as favoring pedestrian safety or emotionally influenced decisions. This diversity reveals the lack of a unified ethical framework, emphasizing the need for interdisciplinary research and transparency tools. Future efforts should focus on aligning models with societal values, especially for applications in sensitive domains like healthcare and law enforcement.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.9 -->
<div class="discussion-item">
<h3>Broad Impacts of Trustworthiness: From Individuals to Society and Beyond</h3>
<div class="discussion-content">
<p>The trustworthiness of generative models has significant individual and societal impacts. On a personal level, these models affect privacy, decision-making, and mental health, with biased outputs perpetuating stereotypes and privacy risks. Societally, they amplify misinformation, economic disruption, and systemic inequality, reinforcing biases and undermining trust in media. Additionally, their high resource demands strain the environment. Ensuring fairness, transparency, and accountability in these models is essential to harness their benefits while minimizing risks.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.10 -->
<div class="discussion-item">
<h3>Alignment: A Double-Edged Sword? Investigating Untrustworthy Behaviors Resulting from
Instruction Tuning</h3>
<div class="discussion-content">
<p>A significant distinction of advanced LLMs is their enhanced ability to follow human instructions, achieved through alignment techniques such as PPO and RLHF. While alignment embeds human values to improve helpfulness and safety, it can also lead to unintended issues, including sycophantic responses, deceptive alignment, overoptimization, and power-seeking behaviors. Mechanistic interpretability offers a promising approach to understanding these behaviors by clarifying model inner workings. Future research should aim to refine alignment methods and develop strategies to mitigate these unintended consequences, enhancing the trustworthiness of large models.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
</div>
</div>
<div class="col-lg-6" data-aos="fade-up" data-aos-delay="200">
<div class="discussion-container">
<!-- 10.11 -->
<div class="discussion-item">
<h3>Lessons Learned in Ensuring Fairness of Generative Foundation Models</h3>
<div class="discussion-content">
<p>Fairness in generative models is complex and context-dependent, requiring adaptations to different groups' needs rather than a one-size-fits-all approach. True fairness goes beyond equal treatment, fostering cross-group understanding and empowering users with unbiased information rather than dictating choices. Fairness must be assessed in both the model's development and its outputs to ensure equitable results. Social disparities complicate fairness further, as strict equality can perpetuate inequity; thus, nuanced adjustments may be necessary. Additionally, models must handle factual statements carefully to avoid subtle disparagement, presenting data in ways that do not reinforce harmful stereotypes.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.12 -->
<div class="discussion-item">
<h3>Balancing Dynamic Adaptability and Consistent Safety Protocols in LLMs to Eliminate
Jailbreak Attacks</h3>
<div class="discussion-content">
<p>To address the vulnerabilities in generative models posed by jailbreak attacks, there is a need for robust safety protocols that ensure consistency across varying inputs. Jailbreaks exploit models' adaptability, using rephrasing and contextual shifts to bypass safety measures. Traditional safety training methods, like fine-tuning or RLHF for safety, focus on specific harmful inputs but are insufficient for handling the vast number of potential rephrasings that could still produce harmful outputs. A proposed solution is a "multi-level consistency supervision mechanism," which involves training models to generate safe responses for semantically similar queries regardless of rephrasing, adding a context-sensitive detection module to monitor intent shifts in conversations, and applying post-output defenses to verify adherence to safety protocols in real time. This comprehensive approach reduces the reliance on input-based safety, strengthens security across diverse contexts, and provides dynamic policy adaptation to maintain safety without compromising the model's utility for various user needs.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.13 -->
<div class="discussion-item">
<h3>The Potential and Peril of LLMs for Application: A Case Study of Cybersecurity</h3>
<div class="discussion-content">
<p>The integration of LLMs into cybersecurity brings both innovation and risk. While frameworks like SWE-bench and Cybench showcase their potential in areas like cryptography and automated testing, LLMs also amplify threats. They can accelerate zero-day exploit discovery, automate advanced social engineering, and generate polymorphic malware that evades detection. Beyond cybersecurity, similar risks arise: LLMs can create synthetic disinformation, produce fraudulent academic research, and accelerate dual-use technologies in genetic engineering or pharmaceuticals. Current governance efforts, such as OpenAI's Usage Guidelines and Microsoft's AI Framework, remain preliminary. Moving forward, priorities include developing domain-agnostic detection systems, adaptive defense mechanisms leveraging reinforcement learning, and robust red-teaming frameworks to preempt vulnerabilities. The cybersecurity experience highlights the need for comprehensive governance to balance AI's transformative potential with safeguards against misuse.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.14 -->
<div class="discussion-item">
<h3>Trustworthiness of Generative Foundation Models in the Medical Domain</h3>
<div class="discussion-content">
<p>Integrating GenFMs into healthcare faces challenges in data quality, explainability, and regulation. Medical data is often noisy, incomplete, and constrained by privacy laws like HIPAA and GDPR, limiting sharing and model generalization. Standardized formats and secure collaborations are needed. Model explainability is vital for clinical trust, as opaque AI systems hinder adoption in high-stakes decisions. Techniques like feature visualization and attention mechanisms aim to enhance transparency without sacrificing performance, enabling collaboration between clinicians and AI. Regulatory frameworks struggle with the dynamic nature of GenFMs. Issues like liability, validation of iterative updates, and accountability require clearer standards. Collaborative efforts among policymakers, developers, and regulators are essential to ensure these models are safe, reliable, and ethically deployed in healthcare.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.15 -->
<div class="discussion-item">
<h3>Trustworthiness of Generative Foundation Models in AI for Science</h3>
<div class="discussion-content">
<p>Generative models in scientific fields like chemistry, biology, and materials science present unique trustworthiness challenges due to the critical need for precision, safety, and ethical compliance. While these models can accelerate discovery, they also risk generating harmful outputs, requiring a balance between innovation and caution. Trust in model outputs relies on transparency, validation, and clear uncertainty measures; human oversight remains essential to assess AI predictions critically. Responsible deployment frameworks, phased testing, ethical constraints, and experimental validation help ensure reliability and safety. This collaborative human-AI approach supports rapid scientific progress while upholding essential standards for safe and ethical application.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.16 -->
<div class="discussion-item">
<h3>Trustworthiness Concerns in Robotics and Other Embodiments of Generative Foundation Models</h3>
<div class="discussion-content">
<p>The integration of large language and vision-language models into robots enhances their processing and recognition abilities but introduces safety risks. LLMs and VLMs can generate hallucinations and misinterpretations, which, when applied to robots, may lead to unsafe actions in the real world. Safety concerns arise in two areas: reasoning and planning, where poor decisions can cause accidents, and physical actions, where incorrect or forceful movements can harm humans or damage objects. Ensuring safety in embodied AGI requires robust control of both cognitive and physical behaviors, with adaptability to unexpected situations.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.17 -->
<div class="discussion-item">
<h3>Trustworthiness of Generative Foundation Models in Human-AI Collaboration</h3>
<div class="discussion-content">
<p>Human-AI collaboration offers transformative potential but raises challenges in trust calibration and accountability. Trust calibration involves balancing overtrust and undertrust in AI outputs. Users' limited understanding of GenFMs, coupled with their opaque nature, complicates this process. Strategies like verbalized confidence scores, uncertainty estimation, and intuitive explainability mechanisms can help users evaluate AI reliability, fostering confident and effective collaboration. Accountability remains a significant concern, especially in error attribution. Determining whether errors stem from AI, humans, or both is complex. Solutions include fine-grained audits, decision pathway logging, and context-aware explanations. Error-aware interfaces that visually trace AI logic and flag issues can promote transparency and critical user engagement. By clarifying responsibility and improving system transparency, trust and accountability are strengthened, ensuring robust, ethical human-AI collaboration in high-stakes applications.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.18 -->
<div class="discussion-item">
<h3>The Role of Natural Noise in Shaping Model Robustness and Security Risks</h3>
<div class="discussion-content">
<p>Robustness is a critical metric for evaluating GenFMs, measuring response consistency under natural perturbations. Our analysis highlights two key considerations: (1) Balancing robustness and overfitting risks. While adversarial training generally improves model stability, excessive optimization can lead to overfitting, reducing generalization to novel perturbations and degrading primary task performance. Balanced approaches are essential to mitigate these risks while leveraging noise benefits. (2) Differing robustness needs across prompt types. Models demonstrate higher robustness in close-ended queries, where consistency is crucial for safety-critical applications like autonomous driving or healthcare. Errors in such scenarios can have severe consequences. Open-ended queries, by contrast, tolerate variability, focusing more on coherence and relevance than strict consistency. Addressing these distinct needs ensures robust, reliable performance across both query types, enhancing GenFMs' applicability in diverse, high-stakes contexts.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
<!-- 10.19 -->
<div class="discussion-item">
<h3>Confronting Advanced AI Risks: A New Paradigm for Governing GenFMs</h3>
<div class="discussion-content">
<p>The rapid evolution of GenFMs introduces Advanced AI Risks, requiring proactive governance beyond traditional mitigation strategies. (1) Self-replication and autonomy: GenFMs capable of autonomous replication or executing cyberattacks and bioengineering tasks pose catastrophic risks, necessitating safeguards to prevent misuse. (2) Persuasion and manipulation: These models can influence emotions and political opinions, threatening societal integrity and democratic processes. (3) Anthropomorphism: Assigning human-like traits to AI inflates trust, obscures accountability, and fosters misplaced confidence. Addressing these risks demands a comprehensive approach: clarifying AI intent and agency, prioritizing human oversight to ensure AI remains subordinate to human decision-making, and recognizing the systemic nature of advanced risks requiring global cooperation. Frameworks like Anthropic's AI Safety Levels (ASL) categorize models based on potential threats, aligning safety measures with risk tiers. Continuous monitoring and dynamic trustworthiness criteria are essential to preempt vulnerabilities as GenFMs advance.</p>
</div>
<i class="discussion-toggle bi bi-chevron-right"></i>
</div>
</div>
</div><!-- End discussion Column-->
</div>
</div>
</section><!-- /discussion Section -->
<!-- Team Section -->
<section id="team" class="team section">
<div class="container section-title" data-aos="fade-up">
<h2>Our Team</h2>
</div><!-- End Section Title -->
<div class="container">
<div class="swiper init-swiper swiper1">
<div class="swiper-wrapper align-items-center">
<div class="swiper-slide"><img src="assets/img/clients/ND.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/UIUC.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/MBZUAI.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/CMU.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/Emory.jpg" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/TAMU.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/lehigh.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/Lucy.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/UNC.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/UChi.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/allen.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/UMiami.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/ASU.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/UWM.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/NUS.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/C-CAS.png" class="img-fluid" alt=""></div>
</div>
</div>
</div>
<div class="container">
<div class="swiper init-swiper swiper2">
<div class="swiper-wrapper align-items-center">
<div class="swiper-slide"><img src="assets/img/clients/UW.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/MIT.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/UMD.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/UIC.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/IBM.png" class="img-fluid" alt=""></div>
<!-- <div class="swiper-slide"><img src="assets/img/clients/IIT.png" class="img-fluid" alt=""></div> -->
<div class="swiper-slide"><img src="assets/img/clients/USC.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/IBM-ND.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/Cornell.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/salesforce.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/cispa.png" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/microsoft.jpeg" class="img-fluid" alt=""></div>
<div class="swiper-slide"><img src="assets/img/clients/SMU.png" class="img-fluid" alt=""></div>
</div>
</div>
</div>
</section><!-- /Team Section -->
</main>
<footer id="footer" class="footer dark-background">
<div class="container footer-top">
<div class="row gy-4">
<div class="col-lg-6 col-md-6 footer-about">
<a href="index.html" class="d-flex align-items-center">
<span class="sitename">TrustGen</span>
</a>
<div class="footer-contact pt-3">
<!-- General Email Section -->
<p><strong>General Email:</strong></p>
<p><span>Yue Huang: howiehwong@gmail.com</span></p>
<p><span>Xiangliang Zhang: xzhang33@nd.edu</span></p>
<!-- Toolkit & Benchmark Email Section -->
<p><strong>Toolkit &amp; Benchmark Email:</strong></p>
<p><span>Siyuan Wu: nauyisu022@gmail.com</span></p>
<p><span>Chujie Gao: gaochujie1107@gmail.com</span></p>
</div>
</div>
<div class="col-lg-2 col-md-3 footer-links">
<h4>Useful Links</h4>
<ul>
<li><i class="bi bi-chevron-right"></i> <a href="https://trustgen.github.io/">Home</a></li>
<li><i class="bi bi-chevron-right"></i> <a href="https://arxiv.org/abs/2502.14296">Paper</a></li>
<!-- <li><i class="bi bi-chevron-right"></i> <a href="#">About us</a></li> -->
<li><i class="bi bi-chevron-right"></i> <a href="https://trustgen.github.io/trustgen_docs/">Docs</a></li>
<li><i class="bi bi-chevron-right"></i> <a href="https://github.com/TrustGen">Github</a></li>
</ul>
</div>
</div>
</div>
<div class="container copyright text-center mt-4">
<p>© <span>Copyright</span> <strong class="px-1 sitename">TrustGen</strong> <span>All Rights Reserved</span></p>
<div class="credits">
<!-- All the links in the footer should remain intact. -->
<!-- You can delete the links only if you've purchased the pro version. -->
<!-- Licensing information: https://bootstrapmade.com/license/ -->
<!-- Purchase the pro version with working PHP/AJAX contact form: [buy-url] -->
<!-- Designed by <a href="https://bootstrapmade.com/">BootstrapMade</a> -->
</div>
</div>
</footer>
<!-- Scroll Top -->
<a href="#" id="scroll-top" class="scroll-top d-flex align-items-center justify-content-center"><i class="bi bi-arrow-up-short"></i></a>
<!-- Preloader -->
<div id="preloader"></div>
<!-- Vendor JS Files -->
<script src="assets/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="assets/vendor/php-email-form/validate.js"></script>
<script src="assets/vendor/aos/aos.js"></script>
<script src="assets/vendor/swiper/swiper-bundle.min.js"></script>
<script src="assets/vendor/glightbox/js/glightbox.min.js"></script>
<script src="assets/vendor/imagesloaded/imagesloaded.pkgd.min.js"></script>
<script src="assets/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<!-- Main JS File -->
<script src="assets/js/main.js"></script>
<script src="assets/js/table.js"></script>
<script src="assets/js/anime.js"></script>
<script src="anime.min.js"></script>
</body>
</html>