Hardware Management/FMFM
Andrew.jones (talk | contribs) |
===Fleetscale Memory Fault Management Events===
:- [https://opencompute-org.zoom.us/rec/share/PPU3bkjPRlqsZe8T7K07sqna5R14xSgxrRzn7PrXEUKbOCRH1B6f-12onVe6Gd9p.dZ4x2-95LtD2kOb5?pwd=vytXCp2vY80mI4TZC6bUSBbnsTPFeGuk Jan 16, 2024]
:- [https://opencompute-org.zoom.us/rec/share/7NmO1ubfR58v8Ys8jAaf3RsgoDtdN1BQwCwyCnZ1I4xFh10xbkGBdlxuDglucHHH.WFw1eRxQ_0kK-rpp?pwd=ZlFymeW8GOkVO_0xo7zmcJ58HtGQY__I Jan 2, 2024]
:- [https://opencompute-org.zoom.us/rec/share/GApcoWf05x5U4On1itrQA8jQ5hsv6zeh6xdqp-i-anh0xinll3j2RkscBwf_sP8V.HiLsevH6NH3ZK_a8?pwd=I9p-4Vs5XX0H_h7R-RRKVjch6S6HSM07 Dec 5, 2023]
:- [https://opencompute-org.zoom.us/rec/share/oKnLt_CgYo48yGfBwjQAUAuag4rK4muB7Lh6PGFGay6oF9g7eviGPiFMpWASMNwn.gUbO3b8ojdSdoCsQ?pwd=RRJvLmNg9oA3P9A06fTN-7_iGilzozO7 Nov 21, 2023]
:- [https://opencompute-org.zoom.us/rec/share/5_qSlej9J4sY5wCX_teCBO7UpQwvdsUXDL5Rzx09u-qwbtIn51XTMw0jwoY-UNGj.Uv6H-KsLBdyH1fHG?pwd=GACQB_DL1Diq7BenwRg48u1KaVbHAge1 Nov 7, 2023]
:- [https://opencompute-org.zoom.us/rec/share/ZlSTJbvNoxT7ndWzYfmqWF4xrKTXaZn0gy0L5ro07CdmrpdV1iDLslFgrhxkaCql.vqZYsBF01IXGlMOZ?pwd=fkNIkT5-1NQH7DDOnnqZ8Y10aHfejAI6 Oct 24, 2023]
:- [https://opencompute-org.zoom.us/rec/share/__bvxQL0qigsxksSANTHv3iUmulv8885k1pLU80UVEHcwg_efBuQRrraCIKOWlyw.WB5JQ9WBt53Gp5fQ?pwd=MlpIrwUtH3sthl7Epaz4fxw9Nj9U0IKn Oct 10, 2023]
:- [https://opencompute-org.zoom.us/rec/share/5p9Vu5Q_T98Pz5G6q_0TaRkxNbgU4LmlfatJtkN2Vr5Ko98akpaP7BEbqau7Tj-i.HNGH4M5XF50hcY-W?pwd=sX-B1zHuOIP0LTty2jYcJLxMobzwblx- Sep 26, 2023]
:- [https://opencompute-org.zoom.us/rec/share/c2L7pv9YOi_HZixQL52UIRfFwPxv0i-9-7ApXVcwVoWb0E0T7WIitwYGPK5AEAnR.eIDYGJ7iflwhgE_d?pwd=NA3mPG9p-StXSPxRKqidyTMAolvU-HuW Sep 12, 2023]
:- [https://opencompute-org.zoom.us/rec/share/UUPwWe15sj0IZ1AKz_S84msLpFzV7n2-q6-QwcM03TKdLIhlcSS80DdyuL5ACoPq.RH_DRaCyupWhuwf-?pwd=G7BGOudlMU9ivmPoQqWr0deRncxmpLmk Aug 29, 2023]
Revision as of 20:48, 23 January 2024
Welcome to the OCP Fleetscale Memory Fault Management (FMFM) WIKI
Fleetscale Memory Fault Management is a Workstream within the Hardware Management Project.
Leadership
Scope
The FMFM workstream focuses on standardizing Fleetscale Memory Fault Management.
- Proposed topics:
- Standardize a vendor-agnostic architecture for memory error handling
- Modularize inputs from different hardware vendors
- Define APIs and connections between modules from different vendors
- Define the output of each module (failure cause, health information, RAS actions, etc.)
- Standardize memory error telemetry
- Format content for better fleet-scale RAS management
- Troubleshooting, FRU replacement policies, etc.
- Coordinate with the broader OCP group to make sure there is a path for this general architecture
Get Involved
The subproject meets biweekly on Tuesdays from 7-9 am PST
- Link to the FMFM Calendar
- Link to the Meeting
- You can also dial in using your phone: United States: +1 (646) 749-3112, Access Code: 454-746-381
Mailing List
Participate in the discussion:
- FMFM on OCP Groups.io: FMFM Group Link
- Subscribe to mailing list
- Post to mailing list
Review and provide Feedback
Documents
Link to Fleetscale Memory Fault Management (FMFM) Workstream Proposal on Google Drive