Diabetic retinopathy (DR) and diabetic macular edema (DME) are leading causes of permanent blindness worldwide. Designing an automatic grading system with good generalization ability for DR and DME is vital in clinical practice. However, prior works either grade DR or DME independently, without considering internal correlations between them, or grade them jointly by shared feature representation, yet ignoring potential generalization issues caused by difficult samples and data bias. Aiming to address these problems, we propose a framework for joint grading with the dynamic difficulty-aware weighted loss (DAW) and the dual-stream disentangled learning architecture (DETACH). Inspired by curriculum learning, DAW learns from simple samples to difficult samples dynamically via measuring difficulty adaptively. DETACH separates features of grading tasks to avoid potential emphasis on the bias. With the addition of DAW and DETACH, the model learns robust disentangled feature representations to explore internal correlations between DR and DME and achieve better grading performance. Experiments on three benchmarks show the effectiveness and robustness of our framework under both the intra-dataset and cross-dataset tests.